From: Robert Mustacchi <rm@fingolfin.org>
To: illumos-developer <developer@lists.illumos.org>,
Marcel Telka <marcel@telka.sk>
Subject: Re: [developer] CPU usage - clock related issue
Date: Sun, 28 Jul 2024 10:50:40 -0700 [thread overview]
Message-ID: <e32abba4-e351-4907-986a-ed442749a49a@fingolfin.org> (raw)
In-Reply-To: <4d9797a5-b68c-4c50-9a13-66a9d0d63bd0@fingolfin.org>
On 7/28/24 09:37, Robert Mustacchi wrote:
> On 7/27/24 22:56, Marcel Telka wrote:
>> Hi,
>>
>> It looks like something went wrong between changesets 6e0c6e37fb and
>> 8b913f79fc in the illumos-gate.
>>
>> After upgrade of OpenIndiana from
>> osnet-incorporation@0.5.11-2024.0.0.22264 (illumos-6e0c6e37fb)
>> to
>> osnet-incorporation@0.5.11-2024.0.0.22271 (illumos-8b913f79fc)
>>
>> I see two processes eating full CPU: nwamd and mariadbd. The machine is
>> a qemu/kvm quest (host is Rocky 9).
>>
>> # dtrace -n 'profile-101 /pid == $target/ { @[ustack()] = count() } tick-10s{exit(0)}' -p $(pgrep -x nwamd) | tail -n 40
>> dtrace: description 'profile-101 ' matched 2 probes
>> nwamd`in_past+0x23
>> nwamd`nwamd_event_dequeue+0x1bf
>> nwamd`nwamd_event_handler+0x162
>> nwamd`main+0x1b0
>> nwamd`_start_crt+0x9a
>> nwamd`_start+0x1a
>> 24
>>
>> nwamd`in_past+0x26
>> nwamd`nwamd_event_dequeue+0x1bf
>> nwamd`nwamd_event_handler+0x162
>> nwamd`main+0x1b0
>> nwamd`_start_crt+0x9a
>> nwamd`_start+0x1a
>> 25
>>
>> libc.so.1`__cp_gethrtime+0x5e
>> libc.so.1`__cp_clock_gettime_realtime+0x77
>> libc.so.1`__clock_gettime+0x72
>> libc.so.1`clock_gettime+0x26
>> nwamd`in_past+0x23
>> nwamd`nwamd_event_dequeue+0x1bf
>> nwamd`nwamd_event_handler+0x162
>> nwamd`main+0x1b0
>> nwamd`_start_crt+0x9a
>> nwamd`_start+0x1a
>> 27
>>
>> libc.so.1`__cp_tsc_read+0x19
>> libc.so.1`__cp_gethrtime+0x39
>> libc.so.1`__cp_clock_gettime_realtime+0x77
>> libc.so.1`__clock_gettime+0x72
>> libc.so.1`clock_gettime+0x26
>> nwamd`in_past+0x23
>> nwamd`nwamd_event_dequeue+0x1bf
>> nwamd`nwamd_event_handler+0x162
>> nwamd`main+0x1b0
>> nwamd`_start_crt+0x9a
>> nwamd`_start+0x1a
>> 403
>> # dtrace -n 'profile-101 /pid == $target/ { @[ustack()] = count() } tick-10s{exit(0)}' -p $(pgrep -x mariadbd) | tail -n 40
>> dtrace: description 'profile-101 ' matched 2 probes
>> mariadbd`_ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE+0xb8
>> mariadbd`_ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE+0x8a
>> mariadbd`_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x65
>> libstdc++.so.6.0.32`execute_native_thread_routine+0x10
>> libc.so.1`_thrp_setup+0x77
>> libc.so.1`_lwp_start
>> 126
>>
>> libc.so.1`__cp_tsc_read+0xf
>> libc.so.1`clock_gettime+0x15
>> libstdc++.so.6.0.32`_ZNSt6chrono3_V212steady_clock3nowEv+0x16
>> mariadbd`_ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE+0xb0
>> mariadbd`_ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE+0x8a
>> mariadbd`_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x65
>> libstdc++.so.6.0.32`execute_native_thread_routine+0x10
>> libc.so.1`_thrp_setup+0x77
>> libc.so.1`_lwp_start
>> 130
>>
>> libc.so.1`__cp_tsc_read+0xf
>> libc.so.1`clock_gettime+0x15
>> libstdc++.so.6.0.32`_ZNSt6chrono3_V212system_clock3nowEv+0x16
>> mariadbd`_ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE+0x103
>> mariadbd`_ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE+0x8a
>> mariadbd`_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x65
>> libstdc++.so.6.0.32`execute_native_thread_routine+0x10
>> libc.so.1`_thrp_setup+0x77
>> libc.so.1`_lwp_start
>> 135
>>
>> libc.so.1`__cp_tsc_read+0xf
>> libc.so.1`clock_gettime+0x15
>> libstdc++.so.6.0.32`_ZNSt6chrono3_V212steady_clock3nowEv+0x16
>> mariadbd`_ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE+0xa8
>> mariadbd`_ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE+0x8a
>> mariadbd`_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x65
>> libstdc++.so.6.0.32`execute_native_thread_routine+0x10
>> libc.so.1`_thrp_setup+0x77
>> libc.so.1`_lwp_start
>> 137
>>
>>
>> The obvious suspect is:
>>
>> commit 8b6b46dcb073dba71917d6a7309f0df7bad798a2
>> Author: Robert Mustacchi <rm@fingolfin.org>
>> Date: Tue Jul 23 14:44:22 2024 +0000
>>
>> 14237 Want support for pthread_cond_clockwait() and friends
>> Reviewed by: Andy Fiddaman <illumos@fiddaman.net>
>> Approved by: Gordon Ross <gordon.w.ross@gmail.com>
>>
>>
>> but I didn't bisect yet.
>
> Thanks for the report, Marcel. I will dig in and see what I can find.
> Apologies for the trouble.
I've root caused this and written it up in
https://www.illumos.org/issues/16683. The short form is that I
incorrectly handled how the default initializer set the internal clock
id in the cond_t. I have verified that nwam no longer is in a 100% loop
with the fix in place and confirmed that if I used static initializers
that my tests properly caught the issue before the fix and it is working
afterwards. I'll be sending a review request with additional regression
tests in a separate thread.
Again, I'm sorry for the trouble and inconvenience that this caused.
Thank you for reporting this Marcel.
Robert
next prev parent reply other threads:[~2024-07-28 17:50 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-28 5:56 Marcel Telka
2024-07-28 16:37 ` [developer] " Robert Mustacchi
2024-07-28 17:50 ` Robert Mustacchi [this message]
2024-07-28 23:16 ` Marcel Telka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e32abba4-e351-4907-986a-ed442749a49a@fingolfin.org \
--to=rm@fingolfin.org \
--cc=developer@lists.illumos.org \
--cc=marcel@telka.sk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).