* [musl] [pthread] pthread_barrier_wait invalid case
@ 2021-12-16 15:25 zuotina
2021-12-16 18:16 ` Rich Felker
0 siblings, 1 reply; 5+ messages in thread
From: zuotina @ 2021-12-16 15:25 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 1727 bytes --]
Hi everrone
I encountered a panic problem when using timer_create recently.
Although the probability is small, it still happened.
Finaly I found there is a problem in the code of phtread_barrier_wait,
and review code found that there may be problems in the following place,
81 a_store(&b->_b_lock, 0);
82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
If scheduling occurs between lines 81 and 82, it will be not good.
So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess
```c
81 a_store(&b->_b_lock, 0);
/* If it is scheduled out here, when another thread executes pthread_barrier_wait again,
it can go through the entire function happily, that is, it will not be blocked */
syscall(yiled); // new add for test
// When the dispatch comes back, this b has been released
82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
```
Here is an example of timer_create (src/time/timer_create.c)
There are two threads A and B call pthread_barrier_wait.
The call is as follows
A thread: (timer_create // parent thread)
{
.....
// new add for test---begin
while(b->_b_inst == NULL) {
syscall(yield);
}
// new add for test---end
pthread_barrier_wait();
}
B thread: (start // child thread)
{
.....
// Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait
pthread_barrier_wait();
}
In short, the reason for panic is that pthread_barrier_wait is not blocked as expected;
I hope you help to confirm whether there is a problem with the implementation
of pthread_barrier_wait or am I wrong?
Looking forward to your reply. Thank you.
[-- Attachment #2: Type: text/html, Size: 2764 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [musl] [pthread] pthread_barrier_wait invalid case
2021-12-16 15:25 [musl] [pthread] pthread_barrier_wait invalid case zuotina
@ 2021-12-16 18:16 ` Rich Felker
2021-12-17 14:28 ` [musl] " zuotina
0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2021-12-16 18:16 UTC (permalink / raw)
To: zuotina; +Cc: musl
On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote:
> Hi everrone
>
>
> I encountered a panic problem when using timer_create recently.
> Although the probability is small, it still happened.
> Finaly I found there is a problem in the code of phtread_barrier_wait,
> and review code found that there may be problems in the following place,
> 81 a_store(&b->_b_lock, 0);
> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
> If scheduling occurs between lines 81 and 82, it will be not good.
> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess
> ```c
> 81 a_store(&b->_b_lock, 0);
> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again,
> it can go through the entire function happily, that is, it will not be blocked */
> syscall(yiled); // new add for test
> // When the dispatch comes back, this b has been released
> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
> ```
The intent here is that it's not possible that b has been released,
because all waiters have to synchronize on b->_b_inst. It's possible
there's a bug here. I'll look. What arch are you running on?
> Here is an example of timer_create (src/time/timer_create.c)
> There are two threads A and B call pthread_barrier_wait.
> The call is as follows
> A thread: (timer_create // parent thread)
> {
> .....
> // new add for test---begin
> while(b->_b_inst == NULL) {
> syscall(yield);
> }
> // new add for test---end
> pthread_barrier_wait();
> }
> B thread: (start // child thread)
> {
> .....
> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait
> pthread_barrier_wait();
> }
>
>
> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected;
> I hope you help to confirm whether there is a problem with the implementation
> of pthread_barrier_wait or am I wrong?
>
>
> Looking forward to your reply. Thank you.
Thanks for the report.
Rich
^ permalink raw reply [flat|nested] 5+ messages in thread
* [musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case
2021-12-16 18:16 ` Rich Felker
@ 2021-12-17 14:28 ` zuotina
2022-01-19 14:56 ` [musl] " zuotina
0 siblings, 1 reply; 5+ messages in thread
From: zuotina @ 2021-12-17 14:28 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 2283 bytes --]
At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org> wrote:
>On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote:
>> Hi everrone
>>
>>
>> I encountered a panic problem when using timer_create recently.
>> Although the probability is small, it still happened.
>> Finaly I found there is a problem in the code of phtread_barrier_wait,
>> and review code found that there may be problems in the following place,
>> 81 a_store(&b->_b_lock, 0);
>> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
>> If scheduling occurs between lines 81 and 82, it will be not good.
>> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess
>> ```c
>> 81 a_store(&b->_b_lock, 0);
>> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again,
>> it can go through the entire function happily, that is, it will not be blocked */
>> syscall(yiled); // new add for test
>> // When the dispatch comes back, this b has been released
>> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
>> ```
>
>The intent here is that it's not possible that b has been released,
>because all waiters have to synchronize on b->_b_inst. It's possible
>there's a bug here. I'll look. What arch are you running on?
running on aarch64.
Looking forward to fix, thank you
>> Here is an example of timer_create (src/time/timer_create.c)
>> There are two threads A and B call pthread_barrier_wait.
>> The call is as follows
>> A thread: (timer_create // parent thread)
>> {
>> .....
>> // new add for test---begin
>> while(b->_b_inst == NULL) {
>> syscall(yield);
>> }
>> // new add for test---end
>> pthread_barrier_wait();
>> }
>> B thread: (start // child thread)
>> {
>> .....
>> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait
>> pthread_barrier_wait();
>> }
>>
>>
>> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected;
>> I hope you help to confirm whether there is a problem with the implementation
>> of pthread_barrier_wait or am I wrong?
>>
>>
>> Looking forward to your reply. Thank you.
>
>Thanks for the report.
>
>Rich
[-- Attachment #2: Type: text/html, Size: 2888 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case
2021-12-17 14:28 ` [musl] " zuotina
@ 2022-01-19 14:56 ` zuotina
2022-01-20 2:19 ` 答复: " zhaohang (F)
0 siblings, 1 reply; 5+ messages in thread
From: zuotina @ 2022-01-19 14:56 UTC (permalink / raw)
To: musl
[-- Attachment #1: Type: text/plain, Size: 3087 bytes --]
Hi Team,
Simple feedback on this issue
First, replace pthread_barrier_wait in timer_create with a custom sync function (implemented by __wait, __wake),
then the problem of panic is solved
But I still think the best way is fixing pthread_barrier_wait.
In addition, it is also the problem of the timer_create function. Continue to ask for advice.
```c
timer_create:
case SIGEV_THREAD:
r = pthread_create(&td, &attr, start, &args);
...
if (syscall(SYS_timer_create, clk, &ksev, &timerid) < 0)
timerid = -1;
```
If this syscall fails, the 'start' thread will reside permanently,
so the above only sets timerid = -1, which should not be perfect ?
```c
start:
for (;;) {
while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
}
```
At 2021-12-17 22:28:14, "zuotina" <zuotingyang@126.com> wrote:
At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org> wrote:
>On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote:
>> Hi everrone
>>
>>
>> I encountered a panic problem when using timer_create recently.
>> Although the probability is small, it still happened.
>> Finaly I found there is a problem in the code of phtread_barrier_wait,
>> and review code found that there may be problems in the following place,
>> 81 a_store(&b->_b_lock, 0);
>> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
>> If scheduling occurs between lines 81 and 82, it will be not good.
>> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess
>> ```c
>> 81 a_store(&b->_b_lock, 0);
>> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again,
>> it can go through the entire function happily, that is, it will not be blocked */
>> syscall(yiled); // new add for test
>> // When the dispatch comes back, this b has been released
>> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
>> ```
>
>The intent here is that it's not possible that b has been released,
>because all waiters have to synchronize on b->_b_inst. It's possible
>there's a bug here. I'll look. What arch are you running on?
running on aarch64.
Looking forward to fix, thank you
>> Here is an example of timer_create (src/time/timer_create.c)
>> There are two threads A and B call pthread_barrier_wait.
>> The call is as follows
>> A thread: (timer_create // parent thread)
>> {
>> .....
>> // new add for test---begin
>> while(b->_b_inst == NULL) {
>> syscall(yield);
>> }
>> // new add for test---end
>> pthread_barrier_wait();
>> }
>> B thread: (start // child thread)
>> {
>> .....
>> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait
>> pthread_barrier_wait();
>> }
>>
>>
>> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected;
>> I hope you help to confirm whether there is a problem with the implementation
>> of pthread_barrier_wait or am I wrong?
>>
>>
>> Looking forward to your reply. Thank you.
>
>Thanks for the report.
>
>Rich
[-- Attachment #2: Type: text/html, Size: 4906 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* 答复: [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case
2022-01-19 14:56 ` [musl] " zuotina
@ 2022-01-20 2:19 ` zhaohang (F)
0 siblings, 0 replies; 5+ messages in thread
From: zhaohang (F) @ 2022-01-20 2:19 UTC (permalink / raw)
To: musl; +Cc: zhangwentao (M)
[-- Attachment #1: Type: text/plain, Size: 4060 bytes --]
Maybe the following patch can solve this lacking issue
diff --git a/src/time/timer_create.c b/src/time/timer_create.c
index 0a29f05c2..dcd24fdcc 100644
--- a/src/time/timer_create.c
+++ b/src/time/timer_create.c
@@ -103,6 +103,10 @@ static void *start(void *arg)
union sigval val = args->sev->sigev_value;
__child_sync(&args->b);
+
+ if (self->timer_id < 0)
+ return 0;
+
for (;;) {
siginfo_t si;
while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
发件人: zuotina [mailto:zuotingyang@126.com]
发送时间: 2022年1月19日 22:56
收件人: musl@lists.openwall.com
主题: [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait invalid case
Hi Team,
Simple feedback on this issue
First, replace pthread_barrier_wait in timer_create with a custom sync function (implemented by __wait, __wake),
then the problem of panic is solved
But I still think the best way is fixing pthread_barrier_wait.
In addition, it is also the problem of the timer_create function. Continue to ask for advice.
```c
timer_create:
case SIGEV_THREAD:
r = pthread_create(&td, &attr, start, &args);
...
if (syscall(SYS_timer_create, clk, &ksev, &timerid) < 0)
timerid = -1;
```
If this syscall fails, the 'start' thread will reside permanently,
so the above only sets timerid = -1, which should not be perfect ?
```c
start:
for (;;) {
while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
}
```
At 2021-12-17 22:28:14, "zuotina" <zuotingyang@126.com<mailto:zuotingyang@126.com>> wrote:
At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org<mailto:dalias@libc.org>> wrote:
>On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote:
>> Hi everrone
>>
>>
>> I encountered a panic problem when using timer_create recently.
>> Although the probability is small, it still happened.
>> Finaly I found there is a problem in the code of phtread_barrier_wait,
>> and review code found that there may be problems in the following place,
>> 81 a_store(&b->_b_lock, 0);
>> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
>> If scheduling occurs between lines 81 and 82, it will be not good.
>> So I did an experiment and modified the source code of pthread_barrier_wait to verify my guess
>> ```c
>> 81 a_store(&b->_b_lock, 0);
>> /* If it is scheduled out here, when another thread executes pthread_barrier_wait again,
>> it can go through the entire function happily, that is, it will not be blocked */
>> syscall(yiled); // new add for test
>> // When the dispatch comes back, this b has been released
>> 82 if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
>> ```
>
>The intent here is that it's not possible that b has been released,
>because all waiters have to synchronize on b->_b_inst. It's possible
>there's a bug here. I'll look. What arch are you running on?
running on aarch64.
Looking forward to fix, thank you
>> Here is an example of timer_create (src/time/timer_create.c)
>> There are two threads A and B call pthread_barrier_wait.
>> The call is as follows
>> A thread: (timer_create // parent thread)
>> {
>> .....
>> // new add for test---begin
>> while(b->_b_inst == NULL) {
>> syscall(yield);
>> }
>> // new add for test---end
>> pthread_barrier_wait();
>> }
>> B thread: (start // child thread)
>> {
>> .....
>> // Ensure that this function is advanced to the if (!inst) {} branch of barrier_wait
>> pthread_barrier_wait();
>> }
>>
>>
>> In short, the reason for panic is that pthread_barrier_wait is not blocked as expected;
>> I hope you help to confirm whether there is a problem with the implementation
>> of pthread_barrier_wait or am I wrong?
>>
>>
>> Looking forward to your reply. Thank you.
>
>Thanks for the report.
>
>Rich
[-- Attachment #2: Type: text/html, Size: 20227 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-01-20 2:20 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-16 15:25 [musl] [pthread] pthread_barrier_wait invalid case zuotina
2021-12-16 18:16 ` Rich Felker
2021-12-17 14:28 ` [musl] " zuotina
2022-01-19 14:56 ` [musl] " zuotina
2022-01-20 2:19 ` 答复: " zhaohang (F)
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).