* [musl] [pthread] pthread_barrier_wait invalid case @ 2021-12-16 15:25 zuotina 2021-12-16 18:16 ` Rich Felker 0 siblings, 1 reply; 5+ messages in thread From: zuotina @ 2021-12-16 15:25 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1727 bytes --] Hi everrone I encountered a panic problem when using timer_create recently. Although the probability is small, it still happened. Finaly I found there is a problem in the code of phtread_barrier_wait, and review code found that there may be problems in the following place, 81 a_store(&b->_b_lock, 0); 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); If scheduling occurs between lines 81 and 82, it will be not good. So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess ```c 81 a_store(&b->_b_lock, 0); /* If it is scheduled out here, when another thread executes pthread_barrier_wait again, it can go through the entire function happily, that is, it will not be blocked */ syscall(yiled); // new add for test // When the dispatch comes back, this b has been released 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); ``` Here is an example of timer_create (src/time/timer_create.c) There are two threads A and B call pthread_barrier_wait. The call is as follows A thread: (timer_create // parent thread) { ..... // new add for test---begin while(b->_b_inst == NULL) { syscall(yield); } // new add for test---end pthread_barrier_wait(); } B thread: (start // child thread) { ..... // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait pthread_barrier_wait(); } In short, the reason for panic is that pthread_barrier_wait is not blocked as expected; I hope you help to confirm whether there is a problem with the implementation of pthread_barrier_wait or am I wrong? Looking forward to your reply. Thank you. [-- Attachment #2: Type: text/html, Size: 2764 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [musl] [pthread] pthread_barrier_wait invalid case 2021-12-16 15:25 [musl] [pthread] pthread_barrier_wait invalid case zuotina @ 2021-12-16 18:16 ` Rich Felker 2021-12-17 14:28 ` [musl] " zuotina 0 siblings, 1 reply; 5+ messages in thread From: Rich Felker @ 2021-12-16 18:16 UTC (permalink / raw) To: zuotina; +Cc: musl On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote: > Hi everrone > > > I encountered a panic problem when using timer_create recently. > Although the probability is small, it still happened. > Finaly I found there is a problem in the code of phtread_barrier_wait, > and review code found that there may be problems in the following place, > 81 a_store(&b->_b_lock, 0); > 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); > If scheduling occurs between lines 81 and 82, it will be not good. > So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess > ```c > 81 a_store(&b->_b_lock, 0); > /* If it is scheduled out here, when another thread executes pthread_barrier_wait again, > it can go through the entire function happily, that is, it will not be blocked */ > syscall(yiled); // new add for test > // When the dispatch comes back, this b has been released > 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); > ``` The intent here is that it's not possible that b has been released, because all waiters have to synchronize on b->_b_inst. It's possible there's a bug here. I'll look. What arch are you running on? > Here is an example of timer_create (src/time/timer_create.c) > There are two threads A and B call pthread_barrier_wait. > The call is as follows > A thread: (timer_create // parent thread) > { > ..... > // new add for test---begin > while(b->_b_inst == NULL) { > syscall(yield); > } > // new add for test---end > pthread_barrier_wait(); > } > B thread: (start // child thread) > { > ..... > // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait > pthread_barrier_wait(); > } > > > In short, the reason for panic is that pthread_barrier_wait is not blocked as expected; > I hope you help to confirm whether there is a problem with the implementation > of pthread_barrier_wait or am I wrong? > > > Looking forward to your reply. Thank you. Thanks for the report. Rich ^ permalink raw reply [flat|nested] 5+ messages in thread
* [musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case 2021-12-16 18:16 ` Rich Felker @ 2021-12-17 14:28 ` zuotina 2022-01-19 14:56 ` [musl] " zuotina 0 siblings, 1 reply; 5+ messages in thread From: zuotina @ 2021-12-17 14:28 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 2283 bytes --] At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org> wrote: >On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote: >> Hi everrone >> >> >> I encountered a panic problem when using timer_create recently. >> Although the probability is small, it still happened. >> Finaly I found there is a problem in the code of phtread_barrier_wait, >> and review code found that there may be problems in the following place, >> 81 a_store(&b->_b_lock, 0); >> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); >> If scheduling occurs between lines 81 and 82, it will be not good. >> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess >> ```c >> 81 a_store(&b->_b_lock, 0); >> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again, >> it can go through the entire function happily, that is, it will not be blocked */ >> syscall(yiled); // new add for test >> // When the dispatch comes back, this b has been released >> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); >> ``` > >The intent here is that it's not possible that b has been released, >because all waiters have to synchronize on b->_b_inst. It's possible >there's a bug here. I'll look. What arch are you running on? running on aarch64. Looking forward to fix, thank you >> Here is an example of timer_create (src/time/timer_create.c) >> There are two threads A and B call pthread_barrier_wait. >> The call is as follows >> A thread: (timer_create // parent thread) >> { >> ..... >> // new add for test---begin >> while(b->_b_inst == NULL) { >> syscall(yield); >> } >> // new add for test---end >> pthread_barrier_wait(); >> } >> B thread: (start // child thread) >> { >> ..... >> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait >> pthread_barrier_wait(); >> } >> >> >> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected; >> I hope you help to confirm whether there is a problem with the implementation >> of pthread_barrier_wait or am I wrong? >> >> >> Looking forward to your reply. Thank you. > >Thanks for the report. > >Rich [-- Attachment #2: Type: text/html, Size: 2888 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case 2021-12-17 14:28 ` [musl] " zuotina @ 2022-01-19 14:56 ` zuotina 2022-01-20 2:19 ` 答复: " zhaohang (F) 0 siblings, 1 reply; 5+ messages in thread From: zuotina @ 2022-01-19 14:56 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 3087 bytes --] Hi Team, Simple feedback on this issue First, replace pthread_barrier_wait in timer_create with a custom sync function (implemented by __wait, __wake), then the problem of panic is solved But I still think the best way is fixing pthread_barrier_wait. In addition, it is also the problem of the timer_create function. Continue to ask for advice. ```c timer_create: case SIGEV_THREAD: r = pthread_create(&td, &attr, start, &args); ... if (syscall(SYS_timer_create, clk, &ksev, &timerid) < 0) timerid = -1; ``` If this syscall fails, the 'start' thread will reside permanently, so the above only sets timerid = -1, which should not be perfect ? ```c start: for (;;) { while (sigwaitinfo(SIGTIMER_SET, &si) < 0); } ``` At 2021-12-17 22:28:14, "zuotina" <zuotingyang@126.com> wrote: At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org> wrote: >On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote: >> Hi everrone >> >> >> I encountered a panic problem when using timer_create recently. >> Although the probability is small, it still happened. >> Finaly I found there is a problem in the code of phtread_barrier_wait, >> and review code found that there may be problems in the following place, >> 81 a_store(&b->_b_lock, 0); >> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); >> If scheduling occurs between lines 81 and 82, it will be not good. >> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess >> ```c >> 81 a_store(&b->_b_lock, 0); >> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again, >> it can go through the entire function happily, that is, it will not be blocked */ >> syscall(yiled); // new add for test >> // When the dispatch comes back, this b has been released >> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); >> ``` > >The intent here is that it's not possible that b has been released, >because all waiters have to synchronize on b->_b_inst. It's possible >there's a bug here. I'll look. What arch are you running on? running on aarch64. Looking forward to fix, thank you >> Here is an example of timer_create (src/time/timer_create.c) >> There are two threads A and B call pthread_barrier_wait. >> The call is as follows >> A thread: (timer_create // parent thread) >> { >> ..... >> // new add for test---begin >> while(b->_b_inst == NULL) { >> syscall(yield); >> } >> // new add for test---end >> pthread_barrier_wait(); >> } >> B thread: (start // child thread) >> { >> ..... >> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait >> pthread_barrier_wait(); >> } >> >> >> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected; >> I hope you help to confirm whether there is a problem with the implementation >> of pthread_barrier_wait or am I wrong? >> >> >> Looking forward to your reply. Thank you. > >Thanks for the report. > >Rich [-- Attachment #2: Type: text/html, Size: 4906 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* 答复: [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case 2022-01-19 14:56 ` [musl] " zuotina @ 2022-01-20 2:19 ` zhaohang (F) 0 siblings, 0 replies; 5+ messages in thread From: zhaohang (F) @ 2022-01-20 2:19 UTC (permalink / raw) To: musl; +Cc: zhangwentao (M) [-- Attachment #1: Type: text/plain, Size: 4060 bytes --] Maybe the following patch can solve this lacking issue diff --git a/src/time/timer_create.c b/src/time/timer_create.c index 0a29f05c2..dcd24fdcc 100644 --- a/src/time/timer_create.c +++ b/src/time/timer_create.c @@ -103,6 +103,10 @@ static void *start(void *arg) union sigval val = args->sev->sigev_value; __child_sync(&args->b); + + if (self->timer_id < 0) + return 0; + for (;;) { siginfo_t si; while (sigwaitinfo(SIGTIMER_SET, &si) < 0); 发件人: zuotina [mailto:zuotingyang@126.com] 发送时间: 2022年1月19日 22:56 收件人: musl@lists.openwall.com 主题: [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case Hi Team, Simple feedback on this issue First, replace pthread_barrier_wait in timer_create with a custom sync function (implemented by __wait, __wake), then the problem of panic is solved But I still think the best way is fixing pthread_barrier_wait. In addition, it is also the problem of the timer_create function. Continue to ask for advice. ```c timer_create: case SIGEV_THREAD: r = pthread_create(&td, &attr, start, &args); ... if (syscall(SYS_timer_create, clk, &ksev, &timerid) < 0) timerid = -1; ``` If this syscall fails, the 'start' thread will reside permanently, so the above only sets timerid = -1, which should not be perfect ? ```c start: for (;;) { while (sigwaitinfo(SIGTIMER_SET, &si) < 0); } ``` At 2021-12-17 22:28:14, "zuotina" <zuotingyang@126.com<mailto:zuotingyang@126.com>> wrote: At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org<mailto:dalias@libc.org>> wrote: >On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote: >> Hi everrone >> >> >> I encountered a panic problem when using timer_create recently. >> Although the probability is small, it still happened. >> Finaly I found there is a problem in the code of phtread_barrier_wait, >> and review code found that there may be problems in the following place, >> 81 a_store(&b->_b_lock, 0); >> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); >> If scheduling occurs between lines 81 and 82, it will be not good. >> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess >> ```c >> 81 a_store(&b->_b_lock, 0); >> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again, >> it can go through the entire function happily, that is, it will not be blocked */ >> syscall(yiled); // new add for test >> // When the dispatch comes back, this b has been released >> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1); >> ``` > >The intent here is that it's not possible that b has been released, >because all waiters have to synchronize on b->_b_inst. It's possible >there's a bug here. I'll look. What arch are you running on? running on aarch64. Looking forward to fix, thank you >> Here is an example of timer_create (src/time/timer_create.c) >> There are two threads A and B call pthread_barrier_wait. >> The call is as follows >> A thread: (timer_create // parent thread) >> { >> ..... >> // new add for test---begin >> while(b->_b_inst == NULL) { >> syscall(yield); >> } >> // new add for test---end >> pthread_barrier_wait(); >> } >> B thread: (start // child thread) >> { >> ..... >> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait >> pthread_barrier_wait(); >> } >> >> >> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected; >> I hope you help to confirm whether there is a problem with the implementation >> of pthread_barrier_wait or am I wrong? >> >> >> Looking forward to your reply. Thank you. > >Thanks for the report. > >Rich [-- Attachment #2: Type: text/html, Size: 20227 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-01-20 2:20 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-12-16 15:25 [musl] [pthread] pthread_barrier_wait invalid case zuotina 2021-12-16 18:16 ` Rich Felker 2021-12-17 14:28 ` [musl] " zuotina 2022-01-19 14:56 ` [musl] " zuotina 2022-01-20 2:19 ` 答复: " zhaohang (F)
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/musl/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).