mailing list of musl libc
* [musl] [pthread] pthread_barrier_wait  invalid case
@ 2021-12-16 15:25 zuotina
  2021-12-16 18:16 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: zuotina @ 2021-12-16 15:25 UTC (permalink / raw)
  To: musl


Hi everyone,


I recently hit a panic when using timer_create.
The probability is small, but it does happen.
I eventually traced it to pthread_barrier_wait. Reviewing the code,
the following place looks problematic:
81  a_store(&b->_b_lock, 0);
82  if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
If the thread is scheduled out between lines 81 and 82, b may no longer be valid when line 82 runs.
To verify this guess, I modified the source of pthread_barrier_wait as an experiment:
```c
81  a_store(&b->_b_lock, 0);
    /* If this thread is scheduled out here and another thread then calls
       pthread_barrier_wait, that thread runs through the whole function
       without blocking. */
    syscall(SYS_sched_yield); // added for the test
    // By the time this thread runs again, b has already been released.
82  if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
```
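
The window between the releasing store and the later read of `_b_waiters` can be modelled outside musl. The following is a minimal sketch with C11 atomics, not musl's code: `toy_lock`, `toy_unlock_racy` and `toy_unlock_snapshot` are hypothetical names. It contrasts reading the waiter count after the store (the shape of lines 81 and 82) with taking a snapshot before it, so that the store is the last access to the object. Whether this shape is the right fix for the barrier is an assumption, not something established in this thread.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the barrier's lock word and waiter count. */
struct toy_lock {
	atomic_int lock;     /* 1 = held, 0 = free */
	atomic_int waiters;  /* threads sleeping on the lock word */
};

/* Shape of lines 81-82: the object is read again after the release.
 * If the owner frees the object as soon as lock == 0 becomes visible,
 * the load of l->waiters touches memory that is no longer valid.
 * Returns nonzero if the caller should issue a wake. */
int toy_unlock_racy(struct toy_lock *l)
{
	atomic_store(&l->lock, 0);
	return atomic_load(&l->waiters) != 0;   /* access after release */
}

/* Alternative shape: snapshot the waiter count first, so the releasing
 * store is the last access to *l.
 * Returns nonzero if the caller should issue a wake. */
int toy_unlock_snapshot(struct toy_lock *l)
{
	int w = atomic_load(&l->waiters);
	atomic_store(&l->lock, 0);              /* last access to *l */
	return w != 0;
}
```

The snapshot variant has its own gap: a waiter that registers between the snapshot and the store is not counted, so a real implementation would need another way to detect it, for example a contention flag carried in the lock word itself.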
Here is an example based on timer_create (src/time/timer_create.c).
Two threads, A and B, call pthread_barrier_wait.
The calls are as follows:
Thread A (timer_create, the parent thread):
{
       .....
       // added for the test: begin
       while (b->_b_inst == NULL) {
               syscall(SYS_sched_yield);
       }
       // added for the test: end
       pthread_barrier_wait();
}
Thread B (start, the child thread):
{
       .....
       // Make sure this thread takes the if (!inst) {} branch of pthread_barrier_wait
       pthread_barrier_wait();
}


In short, the panic occurs because pthread_barrier_wait does not block as expected.
Could you help confirm whether there is a problem in the implementation
of pthread_barrier_wait, or whether I am mistaken?


Looking forward to your reply. Thank you. 


* Re: [musl] [pthread] pthread_barrier_wait  invalid case
  2021-12-16 15:25 [musl] [pthread] pthread_barrier_wait invalid case zuotina
@ 2021-12-16 18:16 ` Rich Felker
  2021-12-17 14:28   ` [musl] " zuotina
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2021-12-16 18:16 UTC (permalink / raw)
  To: zuotina; +Cc: musl

On Thu, Dec 16, 2021 at 11:25:35PM +0800, zuotina wrote:
> Hi everyone,
> 
> 
> I recently hit a panic when using timer_create.
> The probability is small, but it does happen.
> I eventually traced it to pthread_barrier_wait. Reviewing the code,
> the following place looks problematic:
> 81  a_store(&b->_b_lock, 0);
> 82  if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
> If the thread is scheduled out between lines 81 and 82, b may no longer be valid when line 82 runs.
> To verify this guess, I modified the source of pthread_barrier_wait as an experiment:
> ```c
> 81  a_store(&b->_b_lock, 0);
>     /* If this thread is scheduled out here and another thread then calls
>        pthread_barrier_wait, that thread runs through the whole function
>        without blocking. */
>     syscall(SYS_sched_yield); // added for the test
>     // By the time this thread runs again, b has already been released.
> 82  if (b->_b_waiters) __wake(&b->_b_lock, 1, 1);
> ```

The intent here is that it's not possible that b has been released,
because all waiters have to synchronize on b->_b_inst. It's possible
there's a bug here. I'll look. What arch are you running on?

> Here is an example based on timer_create (src/time/timer_create.c).
> Two threads, A and B, call pthread_barrier_wait.
> The calls are as follows:
> Thread A (timer_create, the parent thread):
> {
>        .....
>        // added for the test: begin
>        while (b->_b_inst == NULL) {
>                syscall(SYS_sched_yield);
>        }
>        // added for the test: end
>        pthread_barrier_wait();
> }
> Thread B (start, the child thread):
> {
>        .....
>        // Make sure this thread takes the if (!inst) {} branch of pthread_barrier_wait
>        pthread_barrier_wait();
> }
> 
> 
> In short, the panic occurs because pthread_barrier_wait does not block as expected.
> Could you help confirm whether there is a problem in the implementation
> of pthread_barrier_wait, or whether I am mistaken?
> 
> 
> Looking forward to your reply. Thank you. 

Thanks for the report.

Rich


* [musl] Re:Re: [musl] [pthread] pthread_barrier_wait  invalid case
  2021-12-16 18:16 ` Rich Felker
@ 2021-12-17 14:28   ` zuotina
  2022-01-19 14:56     ` [musl] " zuotina
  0 siblings, 1 reply; 5+ messages in thread
From: zuotina @ 2021-12-17 14:28 UTC (permalink / raw)
  To: musl


At 2021-12-17 02:16:07, "Rich Felker" <dalias@libc.org> wrote:

>The intent here is that it's not possible that b has been released,
>because all waiters have to synchronize on b->_b_inst. It's possible
>there's a bug here. I'll look. What arch are you running on?

I am running on aarch64.
Looking forward to a fix, thank you.


* [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait  invalid case
  2021-12-17 14:28   ` [musl] " zuotina
@ 2022-01-19 14:56     ` zuotina
  2022-01-20  2:19       ` Re: " zhaohang (F)
  0 siblings, 1 reply; 5+ messages in thread
From: zuotina @ 2022-01-19 14:56 UTC (permalink / raw)
  To: musl




Hi Team,
Some brief feedback on this issue.
First, I replaced pthread_barrier_wait in timer_create with a custom sync function (implemented with __wait and __wake),
and the panic no longer occurs.
But I still think the best approach is to fix pthread_barrier_wait itself.


In addition, there is a separate problem in timer_create that I would like advice on.
```c
/* timer_create */
case SIGEV_THREAD:
	r = pthread_create(&td, &attr, start, &args);
	...
	/* On failure only timerid is updated; the new thread keeps running. */
	if (syscall(SYS_timer_create, clk, &ksev, &timerid) < 0)
		timerid = -1;
```
If this syscall fails, the 'start' thread stays resident permanently,
so only setting timerid = -1 as above does not seem sufficient?
```c
/* start */
for (;;) {
	/* Blocks forever if no timer signal can ever arrive. */
	while (sigwaitinfo(SIGTIMER_SET, &si) < 0);
}
```








* Re: [musl] Re:[musl] Re:Re: [musl] [pthread] pthread_barrier_wait  invalid case
  2022-01-19 14:56     ` [musl] " zuotina
@ 2022-01-20  2:19       ` zhaohang (F)
  0 siblings, 0 replies; 5+ messages in thread
From: zhaohang (F) @ 2022-01-20  2:19 UTC (permalink / raw)
  To: musl; +Cc: zhangwentao (M)


Maybe the following patch can solve the problem of the leaked 'start' thread:

diff --git a/src/time/timer_create.c b/src/time/timer_create.c
index 0a29f05c2..dcd24fdcc 100644
--- a/src/time/timer_create.c
+++ b/src/time/timer_create.c
@@ -103,6 +103,10 @@ static void *start(void *arg)
        union sigval val = args->sev->sigev_value;

        __child_sync(&args->b);
+
+       if (self->timer_id < 0)
+               return 0;
+
        for (;;) {
                siginfo_t si;
                while (sigwaitinfo(SIGTIMER_SET, &si) < 0);


Thread overview: 5+ messages
2021-12-16 15:25 [musl] [pthread] pthread_barrier_wait invalid case zuotina
2021-12-16 18:16 ` Rich Felker
2021-12-17 14:28   ` [musl] " zuotina
2022-01-19 14:56     ` [musl] " zuotina
2022-01-20  2:19       ` Re: " zhaohang (F)

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
