mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] Fix the return value of pthread_getschedparam in musl libc
@ 2020-05-20  2:09 tangyizhou
  2020-05-20 15:50 ` dalias
  0 siblings, 1 reply; 6+ messages in thread
From: tangyizhou @ 2020-05-20  2:09 UTC (permalink / raw)
  To: musl; +Cc: Wanghui (John), Huangshuai (OSLab), dalias

[-- Attachment #1: Type: text/plain, Size: 1222 bytes --]

Hi,

From the latest upstream, I find the return value of pthread_getschedparam() is wrong when syscall sched_getparam succeeds but sched_getscheduler fails. It is rare to see this situation. When it happens, pthread_getschedparam() will return r which must be 0, that means a success according to POSIX 2017 specification. It's wrong.

Also, POSIX 2017 specification says an error number shall be returned to indicate the error. I think returning -*policy is appropriate because syscall sched_getscheduler returns an negative error code in kernel like Linux when it fails.

Here is my patch. I want to be Cc'd on replies. Thanks.

diff --git a/src/thread/pthread_getschedparam.c b/src/thread/pthread_getschedparam.c
index 1cba073d..8203d609 100644
--- a/src/thread/pthread_getschedparam.c
+++ b/src/thread/pthread_getschedparam.c
@@ -12,6 +12,9 @@ int pthread_getschedparam(pthread_t t, int *restrict policy, struct sched_param
                if (!r) {
                        *policy = __syscall(SYS_sched_getscheduler, t->tid);
                }
+               if (*policy < 0) {
+                       r = -*policy;
+               }
        }
        UNLOCK(t->killlock);
        return r;


[-- Attachment #2: Type: text/html, Size: 6459 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Fix the return value of pthread_getschedparam in musl libc
  2020-05-20  2:09 [musl] Fix the return value of pthread_getschedparam in musl libc tangyizhou
@ 2020-05-20 15:50 ` dalias
  2020-05-22  1:53   ` tangyizhou
  0 siblings, 1 reply; 6+ messages in thread
From: dalias @ 2020-05-20 15:50 UTC (permalink / raw)
  To: tangyizhou; +Cc: musl, Wanghui (John), Huangshuai (OSLab)

On Wed, May 20, 2020 at 02:09:25AM +0000, tangyizhou wrote:
> Hi,
> 
> >From the latest upstream, I find the return value of
> >pthread_getschedparam() is wrong when syscall sched_getparam
> >succeeds but sched_getscheduler fails. It is rare to see this
> >situation. When it happens, pthread_getschedparam() will return r
> >which must be 0, that means a success according to POSIX 2017
> >specification. It's wrong.
> 
> Also, POSIX 2017 specification says an error number shall be
> returned to indicate the error. I think returning -*policy is
> appropriate because syscall sched_getscheduler returns an negative
> error code in kernel like Linux when it fails.

What is the situation under which SYS_sched_getparam can succeed but
SYS_sched_getscheduler can fail? The Linux man page documenting the
syscall:

http://man7.org/linux/man-pages/man2/sched_setscheduler.2.html

does not specify any errors meaningful for SYS_sched_getscheduler
other than ESRCH which is already precluded by the previous syscall
having succeeded.

> Here is my patch. I want to be Cc'd on replies. Thanks.

I've tried to CC @huawei.com addresses before but your mail system is
misconfigured to treat SPF neutral as fail. I don't know if this is
something you can get fixed, but getting it fixed would make it easier
to participate in open source projects. Trying again now anyway in
case it's already fixed.

> diff --git a/src/thread/pthread_getschedparam.c b/src/thread/pthread_getschedparam.c
> index 1cba073d..8203d609 100644
> --- a/src/thread/pthread_getschedparam.c
> +++ b/src/thread/pthread_getschedparam.c
> @@ -12,6 +12,9 @@ int pthread_getschedparam(pthread_t t, int *restrict policy, struct sched_param
>                 if (!r) {
>                         *policy = __syscall(SYS_sched_getscheduler, t->tid);
>                 }
> +               if (*policy < 0) {
> +                       r = -*policy;
> +               }
>         }
>         UNLOCK(t->killlock);
>         return r;

If there is a good reason to consider the possibility of
SYS_sched_getscheduler failing here, the patch has at least one bug
and at least one minor conformance issue. As written it examines
*policy even if nonzero r from SYS_sched_getparam failing caused
nothing to be written to *policy (in which case it's accessing
potentially-uninitialized memory, and clobbering the meaningful
error code r with a meaningless one from uninitialized memory). The
test would need to be inside the "if (!r) {" block to be valid.

Also (minor conformance/policy issue) if the function is going to fail
it should not write to *policy at all, but to a temp var, and only
copy to *policy on success.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [musl] Fix the return value of pthread_getschedparam in musl libc
  2020-05-20 15:50 ` dalias
@ 2020-05-22  1:53   ` tangyizhou
  2020-05-22  2:26     ` dalias
  0 siblings, 1 reply; 6+ messages in thread
From: tangyizhou @ 2020-05-22  1:53 UTC (permalink / raw)
  To: dalias; +Cc: musl, Wanghui (John), Huangshuai (OSLab)

> I've tried to CC @huawei.com addresses before but your mail system is misconfigured to treat SPF neutral as fail. I don't know if this is something you can get fixed, but getting it fixed would make it easier to 
> participate in open source projects. Trying again now anyway in case it's already fixed.

I feel sorry for the strict and complicated information security policy of our corporation. I received a mail from our mail system which says that sending to dalias@aerifal.cx is timed out, and I don't know why. 

> What is the situation under which SYS_sched_getparam can succeed but SYS_sched_getscheduler can fail? The Linux man page documenting the
> syscall:

There is a timing issue between two syscalls. Think about a concurrent scenario:
Thread A calls pthread_getschedparam() to query the information of thread B. After SYS_sched_getparam succeeds and before SYS_sched_getscheduler starts, thread B is killed, then SYS_sched_getparam returns -ESRCH.

> If there is a good reason to consider the possibility of SYS_sched_getscheduler failing here, the patch has at least one bug and at least one minor conformance issue. As written it examines *policy even if 
> nonzero r from SYS_sched_getparam failing caused nothing to be written to *policy (in which case it's accessing potentially-uninitialized memory, and clobbering the meaningful error code r with a meaningless 
> one from uninitialized memory). The test would need to be inside the "if (!r) {" block to be valid.

The test should be inside the "if (!r) {" block. You're quite right, thanks.

> Also (minor conformance/policy issue) if the function is going to fail it should not write to *policy at all, but to a temp var, and only copy to *policy on success.

The value of *policy is undefined when pthread_getschedparam() fails  according to POSIX 2017 specification:

https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_getschedparam.html

Only copying to *policy on success is a good programming practice. Here is my new patch:

diff --git a/src/thread/pthread_getschedparam.c b/src/thread/pthread_getschedparam.c
index 1cba073d..4922cfa8 100644
--- a/src/thread/pthread_getschedparam.c
+++ b/src/thread/pthread_getschedparam.c
@@ -10,7 +10,11 @@ int pthread_getschedparam(pthread_t t, int *restrict policy, struct sched_param
        } else {
                r = -__syscall(SYS_sched_getparam, t->tid, param);
                if (!r) {
-                       *policy = __syscall(SYS_sched_getscheduler, t->tid);
+                       r = -__syscall(SYS_sched_getscheduler, t->tid);
+                       if (r <= 0) {
+                               *policy = -r;
+                               r = 0;
+                       }
                }
        }
        UNLOCK(t->killlock);

Yizhou


-----Original Message-----
From: dalias@aerifal.cx [mailto:dalias@aerifal.cx] 
Sent: Wednesday, May 20, 2020 11:51 PM
To: tangyizhou <tangyizhou@huawei.com>
Cc: musl@lists.openwall.com; Wanghui (John) <john.wanghui@huawei.com>; Huangshuai (OSLab) <elvis.huang@huawei.com>
Subject: Re: [musl] Fix the return value of pthread_getschedparam in musl libc

On Wed, May 20, 2020 at 02:09:25AM +0000, tangyizhou wrote:
> Hi,
> 
> >From the latest upstream, I find the return value of
> >pthread_getschedparam() is wrong when syscall sched_getparam succeeds 
> >but sched_getscheduler fails. It is rare to see this situation. When 
> >it happens, pthread_getschedparam() will return r which must be 0, 
> >that means a success according to POSIX 2017 specification. It's 
> >wrong.
> 
> Also, POSIX 2017 specification says an error number shall be returned 
> to indicate the error. I think returning -*policy is appropriate 
> because syscall sched_getscheduler returns an negative error code in 
> kernel like Linux when it fails.

What is the situation under which SYS_sched_getparam can succeed but SYS_sched_getscheduler can fail? The Linux man page documenting the
syscall:

http://man7.org/linux/man-pages/man2/sched_setscheduler.2.html

does not specify any errors meaningful for SYS_sched_getscheduler other than ESRCH which is already precluded by the previous syscall having succeeded.

> Here is my patch. I want to be Cc'd on replies. Thanks.

I've tried to CC @huawei.com addresses before but your mail system is misconfigured to treat SPF neutral as fail. I don't know if this is something you can get fixed, but getting it fixed would make it easier to participate in open source projects. Trying again now anyway in case it's already fixed.

> diff --git a/src/thread/pthread_getschedparam.c 
> b/src/thread/pthread_getschedparam.c
> index 1cba073d..8203d609 100644
> --- a/src/thread/pthread_getschedparam.c
> +++ b/src/thread/pthread_getschedparam.c
> @@ -12,6 +12,9 @@ int pthread_getschedparam(pthread_t t, int *restrict policy, struct sched_param
>                 if (!r) {
>                         *policy = __syscall(SYS_sched_getscheduler, t->tid);
>                 }
> +               if (*policy < 0) {
> +                       r = -*policy;
> +               }
>         }
>         UNLOCK(t->killlock);
>         return r;

If there is a good reason to consider the possibility of SYS_sched_getscheduler failing here, the patch has at least one bug and at least one minor conformance issue. As written it examines *policy even if nonzero r from SYS_sched_getparam failing caused nothing to be written to *policy (in which case it's accessing potentially-uninitialized memory, and clobbering the meaningful error code r with a meaningless one from uninitialized memory). The test would need to be inside the "if (!r) {" block to be valid.

Also (minor conformance/policy issue) if the function is going to fail it should not write to *policy at all, but to a temp var, and only copy to *policy on success.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Fix the return value of pthread_getschedparam in musl libc
  2020-05-22  1:53   ` tangyizhou
@ 2020-05-22  2:26     ` dalias
  2020-05-28 14:27       ` tangyizhou
  0 siblings, 1 reply; 6+ messages in thread
From: dalias @ 2020-05-22  2:26 UTC (permalink / raw)
  To: tangyizhou; +Cc: musl, Wanghui (John), Huangshuai (OSLab)

On Fri, May 22, 2020 at 01:53:02AM +0000, tangyizhou wrote:
> > What is the situation under which SYS_sched_getparam can succeed
> > but SYS_sched_getscheduler can fail? The Linux man page
> > documenting the syscall:
> 
> There is a timing issue between two syscalls. Think about a concurrent scenario:
> Thread A calls pthread_getschedparam() to query the information of
> thread B.. After SYS_sched_getparam succeeds and before
> SYS_sched_getscheduler starts, thread B is killed, then
> SYS_sched_getparam returns -ESRCH.

There's not such an issue. t->killlock is held so that this can't
happen, and more importantly, so that the thread can't exit and the
tid be reassigned to a new thread or process that would wrongly be
acted upon.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [musl] Fix the return value of pthread_getschedparam in musl libc
  2020-05-22  2:26     ` dalias
@ 2020-05-28 14:27       ` tangyizhou
  2020-05-28 16:09         ` dalias
  0 siblings, 1 reply; 6+ messages in thread
From: tangyizhou @ 2020-05-28 14:27 UTC (permalink / raw)
  To: dalias; +Cc: musl, Wanghui (John), Huangshuai (OSLab)

> There's not such an issue. t->killlock is held so that this can't happen, and more importantly, so that the thread can't exit and the tid be reassigned to a new thread or process that would wrongly be acted upon.

Sorry for late reply.

t->killlock is held only in pthread functions, and it won't work in the following situation.  Assuming process A is running on CPU core 0, process B is running on CPU core 1, process C is running on CPU core 2. Process A calls pthread_getschedparam() to query the information of process B. After SYS_sched_getparam succeeds and before SYS_sched_getscheduler starts,  we assume the scheduling timeslice of A is running out, then A is put in the runqueue of the kernel. This is a chance for C to call kill() to kill B. When A is running again, SYS_sched_getparam returns -ESRCH.

Process B may be terminated due to other reasons when A is put in the runqueue. For example, B is running and encounters a bus error, then B is terminated because of SIGBUS signal.

It very hard to see these situations, but they exist in a theoretical way. There isn't such an issue for the implementation of pthread_getschedparam() of glibc.


Yizhou

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Fix the return value of pthread_getschedparam in musl libc
  2020-05-28 14:27       ` tangyizhou
@ 2020-05-28 16:09         ` dalias
  0 siblings, 0 replies; 6+ messages in thread
From: dalias @ 2020-05-28 16:09 UTC (permalink / raw)
  To: tangyizhou; +Cc: musl, Wanghui (John), Huangshuai (OSLab)

On Thu, May 28, 2020 at 02:27:55PM +0000, tangyizhou wrote:
> > There's not such an issue. t->killlock is held so that this can't
> > happen, and more importantly, so that the thread can't exit and
> > the tid be reassigned to a new thread or process that would
> > wrongly be acted upon.
> 
> Sorry for late reply.
> 
> t->killlock is held only in pthread functions, and it won't work in
> the following situation. Assuming process A is running on CPU core
> 0, process B is running on CPU core 1, process C is running on CPU
> core 2. Process A calls pthread_getschedparam() to query the
> information of process B.

This is not possible. pthread_getschedparam operates on threads not
processes. A pthread_t is only valid in the context of a process.
There is simply no way to pass a pthread_t for a thread in a different
process, because the identifiers are in a separate space. Two
pthread_t values could be numerically identical but refer to
completely different threads, or one of them be invalid, just because
they're local to the process -- and mechanically, the address space --
they're in.

> After SYS_sched_getparam succeeds and
> before SYS_sched_getscheduler starts, we assume the scheduling
> timeslice of A is running out, then A is put in the runqueue of the
> kernel. This is a chance for C to call kill() to kill B. When A is
> running again, SYS_sched_getparam returns -ESRCH.

You seem to be confusing threads and processes. kill signals processes
not threads. It's possible to send a signal to a particular thread;
there's a standard interface to do this within a process,
pthread_kill, and you could go outside the standard interfaces and do
it cross-process using kernel tids with tkill. But that does not cause
the thread to cease to exist. It makes a signal pending for the
thread, and depending on the action for that signal, it may either
cause a signal handler to run or cause *the whole process* to
terminate.

There is no way to forcibly terminate a single thread, from within the
same process or a different one, short of UB or using trace/debugging
type interfaces to attach to the process and do bad things to it.

> Process B may be terminated due to other reasons when A is put in
> the runqueue. For example, B is running and encounters a bus error,
> then B is terminated because of SIGBUS signal.

If SIGBUS is not caught, the whole *process* terminates, not the
thread.

> It very hard to see these situations, but they exist in a
> theoretical way. There isn't such an issue for the implementation of
> pthread_getschedparam() of glibc.

These are non-issues based on your misunderstanding of what threads
are.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-05-28 16:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-20  2:09 [musl] Fix the return value of pthread_getschedparam in musl libc tangyizhou
2020-05-20 15:50 ` dalias
2020-05-22  1:53   ` tangyizhou
2020-05-22  2:26     ` dalias
2020-05-28 14:27       ` tangyizhou
2020-05-28 16:09         ` dalias

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).