mailing list of musl libc
 help / color / mirror / code / Atom feed
* Subject: [PATCH] pthread: Fix bug that pthread_create may cause priority inversion
@ 2019-09-09 13:57 zhaohang (F)
  2019-09-09 14:01 ` 答复: " zhaohang (F)
  2019-09-09 14:54 ` Szabolcs Nagy
  0 siblings, 2 replies; 8+ messages in thread
From: zhaohang (F) @ 2019-09-09 13:57 UTC (permalink / raw)
  To: musl

Thanks for your reply. Maybe this patch may work by switching to having the new thread just wait for the parent to tell it whether priority setup succeeded.

Subject: [PATCH] pthread: Fix bug that pthread_create may cause priority
 inversion

--------

In pthread_create, caller wait for new thread to set its prio, and
then new thread wake caller up. It may cause priority inversion when
caller waits for new thread and the new thread waits for another thread
whose prio is lower than caller.

Just let caller set prio and affinity of new thread to fix it.

Signed-off-by: Zhao Hang <zhaohang14@huawei.com>
---
 src/thread/pthread_create.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/src/thread/pthread_create.c b/src/thread/pthread_create.c
index 7d4dc2e..ae08c0f 100644
--- a/src/thread/pthread_create.c
+++ b/src/thread/pthread_create.c
@@ -181,15 +181,8 @@ static int start(void *p)
 {
        struct start_args *args = p;
        if (args->attr) {
-               pthread_t self = __pthread_self();
-               int ret = -__syscall(SYS_sched_setscheduler, self->tid,
-                       args->attr->_a_policy, &args->attr->_a_prio);
-               if (a_swap(args->perr, ret)==-2)
-                       __wake(args->perr, 1, 1);
-               if (ret) {
-                       self->detach_state = DT_DETACHED;
-                       __pthread_exit(0);
-               }
+               if (a_cas(args->perr, -1, -2) == -1)
+                       __wait(args->perr, 0, -2, 1);
        }
        __syscall(SYS_rt_sigprocmask, SIG_SETMASK, &args->sig_mask, 0, _NSIG/8);
        __pthread_exit(args->start_func(args->start_arg));
@@ -367,10 +360,14 @@ int __pthread_create(pthread_t *restrict res, const pthread_attr_t *restrict att
        }

        if (attr._a_sched) {
-               if (a_cas(&err, -1, -2)==-1)
-                       __wait(&err, 0, -2, 1);
-               ret = err;
-               if (ret) return ret;
+               ret = -__syscall(SYS_sched_setscheduler, new->tid, attr._a_policy, &attr._a_prio);
+               if (ret) {
+                       new->detach_state = DT_DETACHED;
+                       pthread_cancel(new);
+                       return ret;
+               }
+               if (a_swap(&err, 0) == -2)
+                       __wake(&err, 1, 1);
        }

        *res = new;
--
2.7.4

-----邮件原件-----
发件人: Rich Felker [mailto:dalias@aerifal.cx] 代表 Rich Felker
发送时间: 2019年9月5日 21:34
收件人: zhaohang (F) <zhaohang14@huawei.com>
抄送: musl@lists.openwall.com
主题: Re: [musl] src/thread/pthread_create: Why prio of child thread is set by himself

On Thu, Sep 05, 2019 at 02:14:36AM +0000, zhaohang (F) wrote:
> In the function pthread_create, father thread will wait child if 
> attr._a_sched is set, after SYS_clone is finished.Child thread will 
> set his prio in entry 'start', and then wake father thread to 
> continue.
> 
> But consider this kind of situation, there are three threads: A with 
> prio 51, B with prio 30, and C with prio 20 created by A, and there is 
> only simplest sched policy 'FIFO'.
> 
> When system starts, A is running because A is higher than B, then A 
> uses pthread_create to create C. After C is cloned, A wait for C to 
> set prio and wake him up, but after C set his prio to 20, B will be 
> sched. And if B won't exit, A and C will never get sched, even if A is 
> higher than B. Maybe this is a kind of priority inversion.
> 
> So why prio of child is set by himself rather than father? If prio of 
> child is set by father, something will go wrong? Or other 
> considerations?

I think you're correct in your analysis of this problem; I'm going to look at it more in a bit to make sure.

Originally, pthread_create (in the caller) was responsible for setting priority; this changed in b8742f32602add243ee2ce74d804015463726899 and 40bae2d32fd6f3ffea437fa745ad38a1fe77b27e as part of trying to trim down the pthread structure and get init-time-only junk out of it.
However, 04335d9260c076cf4d9264bd93dd3b06c237a639 largely undid that already, and moved the extra start args to a struct on the new thread's stack so that it doesn't contribute to size/clutter in struct pthread. It should be easy to switch back to having the new thread just wait for the parent to tell it whether priority setup succeeded.

One related issue this also turned up is that exiting in detached state is probably a bad idea. Depending on priorities, the thread that failed to start could linger for a long time after pthread_create returns, potentially causing spurious transient resource exhaustion with no way to wait for it to subside. At some point we should probably switch from forcing detached exit to forcing joinable (or equivalent; forcing linking of pthread_join code is somewhat
undesirable) exit so that when a failed pthread_create returns it's not consuming any kernel task resources.

Thanks for the report.

Rich

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-09-16  2:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09 13:57 Subject: [PATCH] pthread: Fix bug that pthread_create may cause priority inversion zhaohang (F)
2019-09-09 14:01 ` 答复: " zhaohang (F)
2019-09-09 14:54 ` Szabolcs Nagy
2019-09-09 17:49   ` Rich Felker
2019-09-11 13:38     ` 答复: [musl] " zhaohang (F)
2019-09-11 13:52       ` Rich Felker
2019-09-11 17:29         ` Rich Felker
2019-09-16  2:27           ` 答复: [musl] " zhaohang (F)

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).