mailing list of musl libc
* possible bug in setjmp implementation for ppc64
@ 2017-07-31 20:06 felix.winkelmann
  2017-07-31 20:30 ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: felix.winkelmann @ 2017-07-31 20:06 UTC
  To: musl; +Cc: peter

Hi!

I think I may have come across a bug in musl on PPC64(le), and the folks
on the #musl IRC channel directed me here. I'm not totally sure whether
the problem is caused by my misunderstanding of C library functions or whether
it is a plain bug in the musl implementation of setjmp(3).

In our project[1] we use setjmp to establish a global trampoline
and allocate small objects on the stack using alloca (see [2] for
more information about the compilation strategy used). I was able to reduce
the code that crashes to the following:

---
#include <stdio.h>
#include <alloca.h>
#include <setjmp.h>
#include <string.h>
#include <stdlib.h>

jmp_buf jb;

int foo = 99;
int c = 0;

void bar()
{
  c++;
  longjmp(jb, 1);
}

int main()
{
  setjmp(jb);
  char *p = alloca(256);
  memset(p, 0, 256);
  printf("%d\n", foo);

  if(c < 10) bar();

  exit(0);
}
---

When executing the longjmp, the code that restores $r2 (TOC) after the call
to setjmp reads invalid data, because the memset apparently clobbered
the stack frame - i.e. the pointer returned by alloca points into a part
of the stack frame that is still in use.

I tried this on arm, x86_64 and ppc64 with glibc and it seems to work fine,
but it crashes when linked with musl (running Alpine Linux on a VM).

If you need more information, please feel free to ask. You can also keep
me CC'd, since I'd be interested in knowing more about the details.


felix

[1] http://www.call-cc.org
[2] http://home.pipeline.com/~hbaker1/CheneyMTA.html




* Re: possible bug in setjmp implementation for ppc64
  2017-07-31 20:06 possible bug in setjmp implementation for ppc64 felix.winkelmann
@ 2017-07-31 20:30 ` Rich Felker
  2017-08-01  5:10   ` Bobby Bingham
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2017-07-31 20:30 UTC
  To: musl

It looks to me like we have a bug here, but it's one where I or
someone else needs to read and understand the PPC64 ELFv2 ABI document
to fully understand what's going on and make a fix. I'll try to get to
it soon, or I'm happy if someone else wants to. I don't just want to
cargo-cult whatever glibc is doing, though; a fix should be
accompanied by an understanding of why it's right.

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-07-31 20:30 ` Rich Felker
@ 2017-08-01  5:10   ` Bobby Bingham
  2017-08-01  5:28     ` Alexander Monakov
  2017-08-01 15:33     ` David Edelsohn
  0 siblings, 2 replies; 16+ messages in thread
From: Bobby Bingham @ 2017-08-01  5:10 UTC
  To: musl

On Mon, Jul 31, 2017 at 04:30:07PM -0400, Rich Felker wrote:
> It looks to me like we have a bug here, but it's one where I or
> someone else needs to read and understand the PPC64 ELFv2 ABI document
> to fully understand what's going on and make a fix. I'll try to get to
> it soon, or I'm happy if someone else wants to. I don't just want to
> cargo-cult whatever glibc is doing, though; a fix should be
> accompanied by an understanding of why it's right.

I think I can explain what's happening.

The TOC pointer is constant within a given dynamic module (the main
executable or a library), but needs to be adjusted at cross-module
calls.  Each function has two entry points in the ELFv2 ABI.  The entry
point for intra-module calls can assume r2 is already set up correctly.
The entry point for inter-module calls starts two instructions earlier
and adjusts r2 before falling through to the intra-module entry point.

Normally, r2 is supposed to be preserved across calls.  For intra-module
calls, there's no problem.  For inter-module calls, the PLT stub saves
the caller's r2 value to a slot in the caller's stack frame that's
required to be reserved for it, at r1+24.  The linker then inserts code
in the caller to restore the value from the stack immediately after the
call.

So what's happening here is that the value of r2 that setjmp saves and
that longjmp restores is the TOC pointer for libc, as set up by the PLT
stub.  It's not the value of r2 that the caller had.  But that's
normally fine -- after the second return from setjmp, the caller will
restore its TOC pointer from the stack where it had been saved by the
PLT stub when it originally called setjmp.  But in this example, gcc
decides to allocate the 256 bytes overtop the part of the stack where
the setjmp PLT stub had saved the TOC pointer, so it gets clobbered.

The problem is that static linking and dynamic linking need to work
differently.  With dynamic linking, we can fix this by changing setjmp
to read the caller's TOC pointer from the reserved slot in the caller's
stack frame, and longjmp to restore it to the stack instead of to r2.

But with static linking, there's no PLT stub or code added by the linker
to restore the TOC pointer from the stack, so we need to save/restore
from/to r2, not the TOC slot in the caller's stack frame.

I think this either requires having different versions of setjmp/longjmp
for static and dynamic libc, or to increase the size of jmpbuf so we can
always save/restore both r2 and the value on the stack, but this would
be an ABI change.
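
The sequence described above can be mimicked with a toy model in portable C.
This is only a sketch: it is not PPC64 code and not musl's implementation,
and the struct, the function names and the two TOC values are all invented
for the illustration. The only detail taken from the discussion is the save
slot at r1+24. The model follows the description above, in which setjmp
saves libc's r2 and the caller relies on the slot after the second return.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model, not PPC64 code: every name and constant here is made up. */
#define CALLER_TOC UINT64_C(0x1000)   /* hypothetical TOC of the main program */
#define LIBC_TOC   UINT64_C(0x2000)   /* hypothetical TOC of libc.so */
enum { TOC_SLOT = 24, FRAME_BYTES = 64 };

struct toy_cpu {
    uint64_t r2;                       /* TOC pointer register */
    unsigned char frame[FRAME_BYTES];  /* caller's frame; r1 points at frame[0] */
};

/* PLT stub: spill the caller's r2 to the reserved slot at r1+24. */
static void plt_stub(struct toy_cpu *cpu)
{
    memcpy(cpu->frame + TOC_SLOT, &cpu->r2, sizeof cpu->r2);
}

/* Linker-inserted instruction after the call: reload r2 from r1+24. */
static void reload_toc(struct toy_cpu *cpu)
{
    memcpy(&cpu->r2, cpu->frame + TOC_SLOT, sizeof cpu->r2);
}

/* Returns the caller's r2 after the second return from setjmp. */
uint64_t r2_after_longjmp(int clobber_frame)
{
    struct toy_cpu cpu = { .r2 = CALLER_TOC };
    uint64_t jb_r2;                    /* the r2 field of the jmp_buf */

    plt_stub(&cpu);                    /* call setjmp through the PLT */
    cpu.r2 = LIBC_TOC;                 /* libc runs with its own TOC */
    jb_r2 = cpu.r2;                    /* setjmp saves r2 as it finds it */
    reload_toc(&cpu);                  /* first return from setjmp */
    assert(cpu.r2 == CALLER_TOC);

    if (clobber_frame)                 /* alloca + memset landing on the slot */
        memset(cpu.frame, 0, sizeof cpu.frame);

    cpu.r2 = jb_r2;                    /* longjmp restores the saved r2 */
    reload_toc(&cpu);                  /* second return from setjmp */
    return cpu.r2;
}
```

In this model r2_after_longjmp(0) yields the caller's TOC again, while
r2_after_longjmp(1) yields whatever the memset left in the slot, which is
the failure the reproducer triggers.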

--
Bobby



* Re: possible bug in setjmp implementation for ppc64
  2017-08-01  5:10   ` Bobby Bingham
@ 2017-08-01  5:28     ` Alexander Monakov
  2017-08-01 22:45       ` Rich Felker
  2017-08-01 15:33     ` David Edelsohn
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-08-01  5:28 UTC
  To: musl

On Tue, 1 Aug 2017, Bobby Bingham wrote:
> I think this either requires having different versions of setjmp/longjmp
> for static and dynamic libc,

Do you mean for non-pic vs pic objects? As I understand, when libc.a is
built with -fpic (so it's suitable for static-pie), setjmp-longjmp need
to preserve saved TOC at (r1+24). So presumably source code would need
to test #ifdef __PIC__?

> or to increase the size of jmpbuf so we can always save/restore both
> r2 and the value on the stack, but this would be an ABI change.

Would that work for non-pic, i.e. is (r1+24) a reserved location even in
non-pic mode? If not, you can't overwrite it from longjmp.

Alexander



* Re: possible bug in setjmp implementation for ppc64
  2017-08-01  5:10   ` Bobby Bingham
  2017-08-01  5:28     ` Alexander Monakov
@ 2017-08-01 15:33     ` David Edelsohn
  2017-08-02 23:00       ` Alexander Monakov
  1 sibling, 1 reply; 16+ messages in thread
From: David Edelsohn @ 2017-08-01 15:33 UTC
  To: musl

On Tue, Aug 1, 2017 at 1:10 AM, Bobby Bingham <koorogi@koorogi.info> wrote:
> I think I can explain what's happening.
>
> The TOC pointer is constant within a given dynamic module (the main
> executable or a library), but needs to be adjusted at cross-module
> calls.  Each function has two entry points in the ELFv2 ABI.  The entry
> point for intra-module calls can assume r2 is already set up correctly.
> The entry point for inter-module calls starts two instructions earlier
> and adjusts r2 before falling through to the intra-module entry point.
>
> Normally, r2 is supposed to be preserved across calls.  For intra-module
> calls, there's no problem.  For inter-module calls, the PLT stub saves
> the caller's r2 value to a slot in the caller's stack frame that's
> required to be reserved for it, at r1+24.  The linker then inserts code
> in the caller to restore the value from the stack immediately after the
> call.
>
> So what's happening here is that the value of r2 that setjmp saves and
> that longjmp restores is the TOC pointer for libc, as set up by the PLT
> stub.  It's not the value of r2 that the caller had.  But that's
> normally fine -- after the second return from setjmp, the caller will
> restore its TOC pointer from the stack where it had been saved by the
> PLT stub when it originally called setjmp.  But in this example, gcc
> decides to allocate the 256 bytes overtop the part of the stack where
> the setjmp PLT stub had saved the TOC pointer, so it gets clobbered.
>
> The problem is that static linking and dynamic linking need to work
> differently.  With dynamic linking, we can fix this by changing setjmp
> to read the caller's TOC pointer from the reserved slot in the caller's
> stack frame, and longjmp to restore it to the stack instead of to r2.
>
> But with static linking, there's no PLT stub or code added by the linker
> to restore the TOC pointer from the stack, so we need to save/restore
> from/to r2, not the TOC slot in the caller's stack frame.
>
> I think this either requires having different versions of setjmp/longjmp
> for static and dynamic libc, or to increase the size of jmpbuf so we can
> always save/restore both r2 and the value on the stack, but this would
> be an ABI change.

The analysis is correct.  Quoting my colleague:

"If glibc is built as a static library, the contents of r2 are saved
in the jmp_buf; but if glibc is built as a dynamic library, the
contents of the TOC save slot is saved in the jmp_buf.   Similarly, if
glibc is built as a dynamic library, longjmp *updates* the TOC save
slot with the r2 value from the jmp_buf before returning."

GLIBC setjmp/longjmp code explicitly differs for shared and static
versions of the library.  Musl libc needs equivalent functionality in
its implementation.
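
The two strategies in the quoted description can be compared in a small toy
model written in portable C. It is a sketch under stated assumptions, not
glibc's or musl's code: the struct, the helper names and the TOC constants
are hypothetical, and only the r1+24 save slot and the save/update behaviour
come from the text above. The frame is always clobbered between the two
returns from setjmp, as in the reproducer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of the two strategies quoted above; all names are made up. */
#define CALLER_TOC UINT64_C(0x1000)   /* hypothetical TOC of the caller */
#define LIBC_TOC   UINT64_C(0x2000)   /* hypothetical TOC of a shared libc */
enum { TOC_SLOT = 24, FRAME_BYTES = 64 };

struct toy_cpu {
    uint64_t r2;                       /* TOC pointer register */
    unsigned char frame[FRAME_BYTES];  /* caller's frame; r1 points at frame[0] */
};

static uint64_t load_slot(const struct toy_cpu *cpu)
{
    uint64_t v;
    memcpy(&v, cpu->frame + TOC_SLOT, sizeof v);
    return v;
}

static void store_slot(struct toy_cpu *cpu, uint64_t v)
{
    memcpy(cpu->frame + TOC_SLOT, &v, sizeof v);
}

/* Returns the caller's r2 after the second return from setjmp. */
uint64_t r2_after_fixed_longjmp(int shared_libc)
{
    struct toy_cpu cpu = { .r2 = CALLER_TOC };
    uint64_t jb_toc;                   /* TOC field of the jmp_buf */

    if (shared_libc) {
        store_slot(&cpu, cpu.r2);      /* PLT stub spills the caller's r2 */
        cpu.r2 = LIBC_TOC;             /* libc's TOC while inside setjmp */
        jb_toc = load_slot(&cpu);      /* setjmp saves the slot, not r2 */
        cpu.r2 = load_slot(&cpu);      /* caller reloads r2 after the call */
    } else {
        jb_toc = cpu.r2;               /* one module: r2 is already right */
    }
    assert(cpu.r2 == CALLER_TOC);      /* first return sees the right TOC */

    memset(cpu.frame, 0, sizeof cpu.frame);   /* alloca + memset */

    if (shared_libc) {
        cpu.r2 = LIBC_TOC;             /* longjmp runs inside libc */
        store_slot(&cpu, jb_toc);      /* longjmp rewrites the save slot */
        cpu.r2 = load_slot(&cpu);      /* caller reloads r2 once more */
    } else {
        cpu.r2 = jb_toc;               /* longjmp restores r2 directly */
    }
    return cpu.r2;
}
```

Both r2_after_fixed_longjmp(1) and r2_after_fixed_longjmp(0) end with the
caller's TOC in r2, because in each case the value saved in the jmp_buf is
the caller's own TOC and longjmp puts it where the caller will look for it.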

Thanks, David



* Re: possible bug in setjmp implementation for ppc64
  2017-08-01  5:28     ` Alexander Monakov
@ 2017-08-01 22:45       ` Rich Felker
  2017-08-01 23:07         ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2017-08-01 22:45 UTC
  To: musl

On Tue, Aug 01, 2017 at 08:28:27AM +0300, Alexander Monakov wrote:
> On Tue, 1 Aug 2017, Bobby Bingham wrote:
> > I think this either requires having different versions of setjmp/longjmp
> > for static and dynamic libc,
> 
> Do you mean for non-pic vs pic objects? As I understand, when libc.a is
> built with -fpic (so it's suitable for static-pie), setjmp-longjmp need
> to preserve saved TOC at (r1+24). So presumably source code would need
> to test #ifdef __PIC__?
> 
> > or to increase the size of jmpbuf so we can always save/restore both
> > r2 and the value on the stack, but this would be an ABI change.
> 
> Would that work for non-pic, i.e. is (r1+24) a reserved location even in
> non-pic mode? If not, you can't overwrite it from longjmp.

Pretty much certainly so; there is no separate "non-PIC ABI". PIC code
is just code that doesn't happen to do certain things not permissible
in PIC. It doesn't have additional permissions to do things that
otherwise wouldn't be permitted in "non-PIC code".

In any case just saving and restoring both is not an ABI change, since
there's plenty of free space (896 bits worth of nonexistent signals)
in the jmp_buf due to the "Hurd sigset_t" mess.
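
For what it's worth, the 896 figure is what you get from a 128-byte
(1024-bit) signal mask area of which 128 bits are counted as real signals.
Both inputs are assumptions inferred from the figure rather than numbers
stated in this message, and the helper name below is made up. A minimal
sketch of the arithmetic:

```c
#include <assert.h>

/* Bits of a signal-mask area left over once the live signals are counted. */
unsigned spare_mask_bits(unsigned mask_bytes, unsigned live_signal_bits)
{
    assert(mask_bytes * 8 >= live_signal_bits);
    return mask_bytes * 8 - live_signal_bits;
}
```

With the assumed inputs, spare_mask_bits(128, 128) is 896, which is far
more than the 64 bits needed for one extra TOC value.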

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-01 22:45       ` Rich Felker
@ 2017-08-01 23:07         ` Rich Felker
  2017-08-02  0:28           ` Bobby Bingham
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2017-08-01 23:07 UTC
  To: musl

On Tue, Aug 01, 2017 at 06:45:33PM -0400, Rich Felker wrote:
> In any case just saving and restoring both is not an ABI change, since
> there's plenty of free space (896 bits worth of nonexistent signals)
> in the jmp_buf due to the "Hurd sigset_t" mess.

It might also be possible to manually create both the entry points for
setjmp, rather than letting the assembler auto-generate them, in which
case I think the choice of which value to save just depends on which
entry point was used. Thoughts?

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-01 23:07         ` Rich Felker
@ 2017-08-02  0:28           ` Bobby Bingham
  2017-08-02  3:55             ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Bobby Bingham @ 2017-08-02  0:28 UTC
  To: musl

On Tue, Aug 01, 2017 at 07:07:59PM -0400, Rich Felker wrote:
> It might also be possible to manually create both the entry points for
> setjmp, rather than letting the assembler auto-generate them, in which
> case I think the choice of which value to save just depends on which
> entry point was used. Thoughts?

I like this idea.  It's slightly more complicated than that because of
the call to setjmp from sigsetjmp, but should still be ok.  I'll work on
a patch.



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02  0:28           ` Bobby Bingham
@ 2017-08-02  3:55             ` Rich Felker
  2017-08-02  4:31               ` Bobby Bingham
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2017-08-02  3:55 UTC
  To: musl

On Tue, Aug 01, 2017 at 07:28:45PM -0500, Bobby Bingham wrote:
> I like this idea.  It's slightly more complicated than that because of
> the call to setjmp from sigsetjmp, but should still be ok.  I'll work on
> a patch.

Hmm, can you elaborate on the situation with sigsetjmp?

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02  3:55             ` Rich Felker
@ 2017-08-02  4:31               ` Bobby Bingham
  2017-08-02  4:58                 ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Bobby Bingham @ 2017-08-02  4:31 UTC
  To: musl

On Tue, Aug 01, 2017 at 11:55:56PM -0400, Rich Felker wrote:
> Hmm, can you elaborate on the situation with sigsetjmp?
>

sigsetjmp calls setjmp, but I believe this will always use the intra-dso
entry point.  Same for the call siglongjmp makes to longjmp.  So calls
via sigsetjmp/siglongjmp will always be detected as local calls, even
when the original caller of sig*jmp is in a different dso.

My plan right now is to create a __setjmp_toc function which is identical
to the normal setjmp except that the TOC pointer to save is passed in as
another parameter.  setjmp will detect which entry point is used, pull
the TOC pointer from the right place, and call __setjmp_toc.  sigsetjmp
will be updated similarly to detect which entry point is used and to
call __setjmp_toc directly instead of going through setjmp.

siglongjmp is currently written in C by just calling longjmp.  I'm tempted
to just add a "siglongjmp:" label in the asm for longjmp and add an
empty powerpc64/siglongjmp.c file to suppress the default
implementation.  I want to ask if there's any reason it wouldn't be
valid for these two functions to have the same address.

> Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02  4:31               ` Bobby Bingham
@ 2017-08-02  4:58                 ` Rich Felker
  2017-08-02 13:38                   ` Bobby Bingham
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2017-08-02  4:58 UTC
  To: musl

On Tue, Aug 01, 2017 at 11:31:55PM -0500, Bobby Bingham wrote:
> sigsetjmp calls setjmp, but I believe this will always use the intra-dso
> entry point.  Same for the call siglongjmp makes to longjmp.  So calls
> via sigsetjmp/siglongjmp will always be detected as local calls, even
> when the original caller of sig*jmp is in a different dso.
> 
> My plan right now is to create a __setjmp_toc function which is identical
> to the normal setjmp except that the TOC pointer to save is passed in as
> another parameter.  setjmp will detect which entry point is used, pull
> the TOC pointer from the right place, and call __setjmp_toc.  sigsetjmp
> will be updated similarly to detect which entry point is used and to
> call __setjmp_toc directly instead of going through setjmp.

I've been thinking about it and at first thought it sounded overly
fragile and hard to understand, but now I think it makes sense and
should work. It would just involve copying r2 to a call-clobbered
argument register before loading the new value, right?

I was considering whether you could just avoid loading the TOC pointer
at all (leaving the correct value in r2 for setjmp to save), and this
might work, but I think it would make calling __sigsetjmp_tail
difficult and error-prone.

> siglongjmp is currently written in C by just calling longjmp.  I'm tempted
> to just add a "siglongjmp:" label in the asm for longjmp and add an
> empty powerpc64/siglongjmp.c file to suppress the default
> implementation.  I want to ask if there's any reason it wouldn't be
> valid for these two functions to have the same address.

I don't see any reason to make this change (it won't make any
functional difference -- call frames and such don't matter at this
point), and at least the siglongjmp symbol would have to be weak to
respect namespace if you did it that way.

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02  4:58                 ` Rich Felker
@ 2017-08-02 13:38                   ` Bobby Bingham
  2017-08-02 14:46                     ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Bobby Bingham @ 2017-08-02 13:38 UTC
  To: musl

On Wed, Aug 02, 2017 at 12:58:16AM -0400, Rich Felker wrote:
> > sigsetjmp calls setjmp, but I believe this will always use the intra-dso
> > entry point.  Same for the call siglongjmp makes to longjmp.  So calls
> > via sigsetjmp/siglongjmp will always be detected as local calls, even
> > when the original caller of sig*jmp is in a different dso.
> >
> > My plan right now is to create a __setjmp_toc function which is identical
> > to the normal setjmp except that the TOC pointer to save is passed in as
> > another parameter.  setjmp will detect which entry point is used, pull
> > the TOC pointer from the right place, and call __setjmp_toc.  sigsetjmp
> > will be updated similarly to detect which entry point is used and to
> > call __setjmp_toc directly instead of going through setjmp.
>
> I've been thinking about it and at first thought it sounded overly
> fragile and hard to understand, but now I think it makes sense and
> should work. It would just involve copying r2 to a call-clobbered
> argument register before loading the new value, right?

I'm not sure what "new value" you're referring to here.

The idea is basically:

	setjmp: # non-local entry point
		r5 = r1[24]
		goto __setjmp_toc

		.localentry # local entry point
		r5 = r2

	__setjmp_toc:
		# all the existing code from setjmp, but save r5 instead of r2
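
For anyone following along, here is a rough C model of the pseudo-code
above. It is only a sketch and not musl code: the structs and the
*_model names are made up for illustration, and the TOC save slot at
r1+24 is represented as a plain struct field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the caller-visible state; not musl code. */
struct caller_state {
    uint64_t r2;            /* TOC register at the moment setjmp is entered */
    uint64_t toc_save_slot; /* stands in for the doubleword at r1+24 */
};

struct jmp_buf_model {
    uint64_t saved_toc;
};

/* Stands in for __setjmp_toc: the existing setjmp body, except that the
 * TOC value to save arrives as an extra argument (r5 in the pseudo-code). */
int setjmp_toc_model(struct jmp_buf_model *jb, uint64_t toc)
{
    jb->saved_toc = toc;
    return 0;
}

/* Non-local entry point: take the TOC from the save slot. */
int setjmp_nonlocal_model(struct jmp_buf_model *jb, const struct caller_state *cs)
{
    return setjmp_toc_model(jb, cs->toc_save_slot);
}

/* Local entry point: take the TOC from r2. */
int setjmp_local_model(struct jmp_buf_model *jb, const struct caller_state *cs)
{
    return setjmp_toc_model(jb, cs->r2);
}

/* Small self-check using arbitrary marker values. */
void setjmp_model_selfcheck(void)
{
    struct caller_state cs = { 0x1000, 0x2000 };
    struct jmp_buf_model jb = { 0 };

    setjmp_nonlocal_model(&jb, &cs);
    assert(jb.saved_toc == 0x2000);

    setjmp_local_model(&jb, &cs);
    assert(jb.saved_toc == 0x1000);
}
```

The only difference between the two entry points in this model is where
the value handed to the common body comes from.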

>
> I was considering whether you could just avoid loading the TOC pointer
> at all (leaving the correct value in r2 for setjmp to save), and this
> might work, but I think it would make calling __sigsetjmp_tail
> difficult and error-prone.
>
> > siglongjmp is currently written in C by just calling longjmp.  I'm tempted
> > to just add a "siglongjmp:" label in the asm for longjmp and add an
> > empty powerpc64/siglongjmp.c file to suppress the default
> > implementation.  I want to ask if there's any reason it wouldn't be
> > valid for these two functions to have the same address.
>
> I don't see any reason to make this change (it won't make any
> functional difference -- call frames and such don't matter at this
> point), and at least the siglongjmp symbol would have to be weak to
> respect namespace if you did it that way.

I'm not sure why a change like this wouldn't be required.

The requirements on longjmp here are:
* when called through the local entry point, restore the TOC pointer
  into r2
* when called via the PLT stub, restore the TOC pointer to the stack

And siglongjmp needs to have the same behavior.  If the main program
makes a cross-dso call to siglongjmp, it needs to restore the TOC
pointer to the stack.  But siglongjmp works by making a local call to
longjmp, meaning without this change, it will only ever restore the TOC
pointer to r2.

>
> Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02 13:38                   ` Bobby Bingham
@ 2017-08-02 14:46                     ` Rich Felker
  2017-08-03  0:19                       ` Bobby Bingham
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2017-08-02 14:46 UTC (permalink / raw)
  To: musl

On Wed, Aug 02, 2017 at 08:38:25AM -0500, Bobby Bingham wrote:
> On Wed, Aug 02, 2017 at 12:58:16AM -0400, Rich Felker wrote:
> > > sigsetjmp calls setjmp, but I believe this will always use the intra-dso
> > > entry point.  Same for the call siglongjmp makes to longjmp.  So calls
> > > via sigsetjmp/siglongjmp will always be detected as local calls, even
> > > when the original caller of sig*jmp is in a different dso.
> > >
> > > My plan right now is to create a __setjmp_toc function which is identical
> > > to the normal setjmp except that the TOC pointer to save is passed in as
> > > another parameter.  setjmp will detect which entry point is used, pull
> > > the TOC pointer from the right place, and call __setjmp_toc.  sigsetjmp
> > > will be updated similarly to detect which entry point is used and to
> > > call __setjmp_toc directly instead of going through setjmp.
> >
> > I've been thinking about it and at first thought it sounded overly
> > fragile and hard to understand, but now I think it makes sense and
> > should work. It would just involve copying r2 to a call-clobbered
> > argument register before loading the new value, right?
> 
> I'm not sure what "new value" you're referring to here.
> 
> The idea is basically:
> 
> 	setjmp: # non-local entry point
> 		r5 = r1[24]
> 		goto __setjmp_toc
> 
> 		.localentry # local entry point
> 		r5 = r2
> 
> 	__setjmp_toc:
> 		# all the existing code from setjmp, but save r5 instead of r2

For sigsetjmp, I think in this case you also need to duplicate the
assembler-generated code that would load a new r2; otherwise you can't
subsequently call __sigsetjmp_tail. For setjmp itself the above should
suffice since setjmp does not need a TOC itself.

Otherwise the pseudo-code above looks like what I expected after
thinking about it for a bit.

> > I was considering whether you could just avoid loading the TOC pointer
> > at all (leaving the correct value in r2 for setjmp to save), and this
> > might work, but I think it would make calling __sigsetjmp_tail
> > difficult and error-prone.
> >
> > > siglongjmp is currently written in C by just calling longjmp.  I'm tempted
> > > to just add a "siglongjmp:" label in the asm for longjmp and add an
> > > empty powerpc64/siglongjmp.c file to suppress the default
> > > implementation.  I want to ask if there's any reason it wouldn't be
> > > valid for these two functions to have the same address.
> >
> > I don't see any reason to make this change (it won't make any
> > functional difference -- call frames and such don't matter at this
> > point), and at least the siglongjmp symbol would have to be weak to
> > respect namespace if you did it that way.
> 
> I'm not sure why a change like this wouldn't be required.
> 
> The requirements on longjmp here are:
> * when called through the local entry point, restore the TOC pointer
>   into r2
> * when called via the PLT stub, restore the TOC pointer to the stack
> 
> And siglongjmp needs to have the same behavior.  If the main program
> makes a cross-dso call to siglongjmp, it needs to restore the TOC
> pointer to the stack.  But siglongjmp works by making a local call to
> longjmp, meaning without this change, it will only ever restore the TOC
> pointer to r2.

Whether the call to longjmp/siglongjmp was local or not is irrelevant.
It's only whether the original call to setjmp/sigsetjmp was local or
not that's relevant. And in either case I'm pretty sure it suffices to
restore the saved value to both *(r1+24) and r2. Per the ABI, *(r1+24)
can't be used for any purpose except saving the TOC, so upon return
from setjmp, the caller's only options are to treat the value at
*(r1+24) as indeterminate or assume it contains the TOC pointer.
Likewise for r2, if the call was non-local, r2 is call-clobbered so it
doesn't matter what it contains after return, and if the call was
local, r2 is expected to contain the caller's TOC pointer.
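
To make the argument concrete, here is a small C model of it. This is
an illustrative sketch under my own assumptions, not the actual asm:
the struct, the enum and the *_model functions are hypothetical, and
the caller's behavior after setjmp returns is reduced to "reload r2
from the save slot" for a PLT call and "use r2 as-is" for a local call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the setjmp caller's frame; not musl code. */
struct frame_model {
    uint64_t r2;            /* TOC register */
    uint64_t toc_save_slot; /* stands in for *(r1+24) */
};

/* How the original setjmp/sigsetjmp call was made. */
enum call_kind { CALL_LOCAL, CALL_VIA_PLT };

/* longjmp side: write the saved TOC to both places unconditionally. */
void longjmp_restore_toc_model(struct frame_model *f, uint64_t saved_toc)
{
    f->toc_save_slot = saved_toc;
    f->r2 = saved_toc;
}

/* TOC the caller ends up using once setjmp has "returned" again. */
uint64_t toc_after_return_model(const struct frame_model *f, enum call_kind kind)
{
    return kind == CALL_VIA_PLT ? f->toc_save_slot : f->r2;
}

/* Self-check: both locations start out clobbered with marker values. */
void longjmp_model_selfcheck(void)
{
    struct frame_model f = { 0xdead, 0xbeef };

    longjmp_restore_toc_model(&f, 0x1234);
    assert(toc_after_return_model(&f, CALL_LOCAL) == 0x1234);
    assert(toc_after_return_model(&f, CALL_VIA_PLT) == 0x1234);
}
```

Because both locations are written, the model never needs to know how
longjmp itself was called, only that the caller reads one of the two.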

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-01 15:33     ` David Edelsohn
@ 2017-08-02 23:00       ` Alexander Monakov
  2017-08-02 23:02         ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Monakov @ 2017-08-02 23:00 UTC (permalink / raw)
  To: musl

On Tue, 1 Aug 2017, David Edelsohn wrote:
> "If glibc is built as a static library, the contents of r2 are saved
> in the jmp_buf; but if glibc is built as a dynamic library, the
> contents of the TOC save slot is saved in the jmp_buf.   Similarly, if
> glibc is built as a dynamic library, longjmp *updates* the TOC save
> slot with the r2 value from the jmp_buf before returning."
> 
> GLIBC setjmp/longjmp code explicitly differs for shared and static
> versions of the library.  Musl libc needs equivalent functionality in
> its implementation.

Note that since Glibc also supports static dlopen, it is possible to arrive
at a situation where libc.a longjmp is used for returning to a call site
of libc.so setjmp, in which case the TOC save slot is not restored as it
ought to be, and the caller of setjmp segfaults. A testcase is available at
https://sourceware.org/bugzilla/show_bug.cgi?id=21895
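
The mismatch can be modeled in a few lines of C. This is a simplified
sketch, not glibc code and not the bugzilla testcase: the names are
hypothetical, and I am assuming the static longjmp restores only r2
while the shared one updates the TOC save slot, as in the quoted
description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a call site that reached setjmp in libc.so. */
struct call_site_model {
    uint64_t r2;            /* TOC register */
    uint64_t toc_save_slot; /* reloaded into r2 after the call returns */
};

enum libc_flavor { LIBC_STATIC, LIBC_SHARED };

/* longjmp as modeled for each flavor of the library. */
void longjmp_flavor_model(struct call_site_model *cs, uint64_t saved_toc,
                          enum libc_flavor flavor)
{
    if (flavor == LIBC_SHARED)
        cs->toc_save_slot = saved_toc; /* shared build updates the slot */
    else
        cs->r2 = saved_toc;            /* static build restores r2 only */
}

/* The call site in this model always reloads its TOC from the slot. */
uint64_t toc_seen_by_call_site_model(const struct call_site_model *cs)
{
    return cs->toc_save_slot;
}

/* Self-check: the slot has been overwritten (0 here) before longjmp runs. */
void static_dlopen_model_selfcheck(void)
{
    struct call_site_model a = { 0, 0 };
    struct call_site_model b = { 0, 0 };

    longjmp_flavor_model(&a, 0x5000, LIBC_SHARED);
    assert(toc_seen_by_call_site_model(&a) == 0x5000);

    longjmp_flavor_model(&b, 0x5000, LIBC_STATIC);
    assert(toc_seen_by_call_site_model(&b) != 0x5000); /* stale TOC */
}
```

In the second case the call site reloads a stale value, which
corresponds to the segfault described above.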

Thanks.
Alexander



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02 23:00       ` Alexander Monakov
@ 2017-08-02 23:02         ` Rich Felker
  0 siblings, 0 replies; 16+ messages in thread
From: Rich Felker @ 2017-08-02 23:02 UTC (permalink / raw)
  To: musl

On Thu, Aug 03, 2017 at 02:00:03AM +0300, Alexander Monakov wrote:
> On Tue, 1 Aug 2017, David Edelsohn wrote:
> > "If glibc is built as a static library, the contents of r2 are saved
> > in the jmp_buf; but if glibc is built as a dynamic library, the
> > contents of the TOC save slot is saved in the jmp_buf.   Similarly, if
> > glibc is built as a dynamic library, longjmp *updates* the TOC save
> > slot with the r2 value from the jmp_buf before returning."
> > 
> > GLIBC setjmp/longjmp code explicitly differs for shared and static
> > versions of the library.  Musl libc needs equivalent functionality in
> > its implementation.
> 
> Note that since Glibc also supports static dlopen, it is possible to arrive
> at a situation where libc.a longjmp is used for returning to a call site
> of libc.so setjmp, in which case the TOC save slot is not restored as it
> ought to be, and the caller of setjmp segfaults. A testcase is available at
> https://sourceware.org/bugzilla/show_bug.cgi?id=21895

Thanks for investigating and writing this up.

Rich



* Re: possible bug in setjmp implementation for ppc64
  2017-08-02 14:46                     ` Rich Felker
@ 2017-08-03  0:19                       ` Bobby Bingham
  0 siblings, 0 replies; 16+ messages in thread
From: Bobby Bingham @ 2017-08-03  0:19 UTC (permalink / raw)
  To: musl

On Wed, Aug 02, 2017 at 10:46:12AM -0400, Rich Felker wrote:
> On Wed, Aug 02, 2017 at 08:38:25AM -0500, Bobby Bingham wrote:
> Whether the call to longjmp/siglongjmp was local or not is irrelevant.
> It's only whether the original call to setjmp/sigsetjmp was local or
> not that's relevant. And in either case I'm pretty sure it suffices to

I think I was treating whether longjmp is called locally as a proxy for
whether setjmp was called locally.  But of course that doesn't work.

I think we're on the same page now.

> restore the saved value to both *(r1+24) and r2. Per the ABI, *(r1+24)
> can't be used for any purpose except saving the TOC, so upon return
> from setjmp, the caller's only options are to treat the value at
> *(r1+24) as indeterminate or assume it contains the TOC pointer.
> Likewise for r2, if the call was non-local, r2 is call-clobbered so it
> doesn't matter what it contains after return, and if the call was
> local, r2 is expected to contain the caller's TOC pointer.
>
> Rich



Thread overview: 16+ messages
2017-07-31 20:06 possible bug in setjmp implementation for ppc64 felix.winkelmann
2017-07-31 20:30 ` Rich Felker
2017-08-01  5:10   ` Bobby Bingham
2017-08-01  5:28     ` Alexander Monakov
2017-08-01 22:45       ` Rich Felker
2017-08-01 23:07         ` Rich Felker
2017-08-02  0:28           ` Bobby Bingham
2017-08-02  3:55             ` Rich Felker
2017-08-02  4:31               ` Bobby Bingham
2017-08-02  4:58                 ` Rich Felker
2017-08-02 13:38                   ` Bobby Bingham
2017-08-02 14:46                     ` Rich Felker
2017-08-03  0:19                       ` Bobby Bingham
2017-08-01 15:33     ` David Edelsohn
2017-08-02 23:00       ` Alexander Monakov
2017-08-02 23:02         ` Rich Felker
