mailing list of musl libc
* [musl] aarch64 sigsetjmp relocation truncation bug, maybe
@ 2023-09-07  0:46 Peter Williams
  2023-09-07  1:01 ` [musl] " Peter Williams
  2023-09-07  3:08 ` [musl] " Markus Wichmann
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Williams @ 2023-09-07  0:46 UTC
  To: musl

Hello,

I'm experiencing a software issue that I think maybe, might, possibly,
indicate a musl bug. I'm not at all familiar with the issues involved,
but it looks like emailing this list might be my best bet for testing
that hypothesis, so that's what I'm doing. Please CC me on any replies
as I'm not subscribed, and accept my apologies if I'm way off base
here.

The short answer is that I'm trying to link a large (~100 MB) static
executable for aarch64 and getting this error out of the binutils
linker:

  /home/rust/sysroot-aarch64/usr/lib/libc.a(sigsetjmp.lo): in function `sigsetjmp':
  /home/buildozer/aports/main/musl/src/v1.2.3/src/signal/aarch64/sigsetjmp.s:7:(.text+0x0):
    relocation truncated to fit: R_AARCH64_CONDBR19 against symbol `setjmp'
    defined in .text section in /home/rust/sysroot-aarch64/usr/lib/libc.a(setjmp.lo)

If I'm understanding correctly, the complaint is that a branch in
sigsetjmp that invokes setjmp is too far away from the definition of
setjmp. My very handwavey idea is that maybe for some reason my program
is causing the linker to want to locate setjmp() and sigsetjmp() really
far away from each other. If that's right, perhaps it would be possible
to modify the assembler code to be able to handle such a situation?
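
To make the error message concrete, here is a minimal C sketch (not part of the original message) of the range check that R_AARCH64_CONDBR19 implies. It assumes the usual A64 encoding of `cbz Xt, label`, in which the target is stored as a signed 19-bit word offset in bits 23:5. The names `encode_cbz_x` and `CBZ_X_OPCODE` are hypothetical, chosen only for illustration, and a real linker does considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed base encoding of "cbz Xt, label" with imm19 and Rt cleared. */
#define CBZ_X_OPCODE 0xB4000000u

/*
 * Hypothetical helper: try to encode "cbz x<rt>, ." plus a byte offset.
 * Returns false when the offset cannot be represented, which is the
 * condition a linker would report as "relocation truncated to fit".
 */
static bool encode_cbz_x(unsigned rt, int64_t offset, uint32_t *insn)
{
    assert(rt < 32);

    /* Branch targets are instruction-aligned (4 bytes). */
    if (offset % 4 != 0)
        return false;

    /* 19-bit signed word offset: -2^20 <= offset < 2^20 bytes (+-1 MiB). */
    if (offset < -(INT64_C(1) << 20) || offset >= (INT64_C(1) << 20))
        return false;

    /* Drop the two low zero bits and keep 19 bits, two's complement. */
    uint32_t imm19 = (uint32_t)((uint64_t)offset >> 2) & 0x7FFFFu;
    *insn = CBZ_X_OPCODE | (imm19 << 5) | rt;
    return true;
}
```

For example, `encode_cbz_x(1, 8, &insn)` succeeds, whereas any offset of 1 MiB or more is rejected. That is roughly the situation in a ~100 MB static link where the two functions end up far apart.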

My build environment is extremely gnarly — I am running my build tools
inside a cross-compiling Alpine Linux chroot that in turn runs inside
an Ubuntu Docker container. (You don't want to know.) I can provide
more details if needed, as well as a script that leads up to the error
on my system, if you have CPU time to burn on the Docker container
build. But since the error seems to be localized within my aarch64
"libc.a", I'm hopeful that maybe this funky environment doesn't
actually affect the core situation here. As you might see in the output
above, I'm using musl 1.2.3.

Thanks for any insight,

Peter


* [musl] Re: aarch64 sigsetjmp relocation truncation bug, maybe
  2023-09-07  0:46 [musl] aarch64 sigsetjmp relocation truncation bug, maybe Peter Williams
@ 2023-09-07  1:01 ` Peter Williams
  2023-09-07  3:08 ` [musl] " Markus Wichmann
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Williams @ 2023-09-07  1:01 UTC
  To: musl

Of course, right after I send the email ... here's another piece of
evidence, which I think is consistent with my interpretation below.

If I simply add the following function to one of my C files:

#include <setjmp.h>

/* Workaround only: reference both setjmp() and sigsetjmp() from one place. */
void you_gotta_be_kidding_me(int arg)
{
    jmp_buf buf1;
    sigjmp_buf buf2;

    setjmp(buf1);
    sigsetjmp(buf2, arg);
}

... the linker error goes away. My hypothesis is/was that code like
this would encourage the linker to locate the two libc functions closer
together, so that the relocation is no longer truncated. So, it appears
that I at least have a workaround.

Peter

On Wed, 2023-09-06 at 20:46 -0400, Peter Williams wrote:
> Hello,
> 
> I'm experiencing a software issue that I think maybe, might,
> possibly, indicate a musl bug. I'm not at all familiar with the
> issues involved, but it looks like emailing this list might be my
> best bet for testing that hypothesis, so that's what I'm doing.
> Please CC me on any replies as I'm not subscribed, and accept my
> apologies if I'm way off base here.
> 
> The short answer is that I'm trying to link a large (~100 MB) static
> executable for aarch64 and getting this error out of the binutils
> linker:
> 
>   /home/rust/sysroot-aarch64/usr/lib/libc.a(sigsetjmp.lo): in function `sigsetjmp':
>   /home/buildozer/aports/main/musl/src/v1.2.3/src/signal/aarch64/sigsetjmp.s:7:(.text+0x0):
>     relocation truncated to fit: R_AARCH64_CONDBR19 against symbol `setjmp'
>     defined in .text section in /home/rust/sysroot-aarch64/usr/lib/libc.a(setjmp.lo)
> 
> If I'm understanding correctly, the complaint is that a branch in
> sigsetjmp that invokes setjmp is too far away from the definition of
> setjmp. My very handwavey idea is that maybe for some reason my
> program is causing the linker to want to locate setjmp() and
> sigsetjmp() really far away from each other. If that's right, perhaps
> it would be possible to modify the assembler code to be able to
> handle such a situation?
> 
> My build environment is extremely gnarly — I am running my build
> tools inside a cross-compiling Alpine Linux chroot that in turn runs
> inside an Ubuntu Docker container. (You don't want to know.) I can
> provide more details if needed, as well as a script that leads up to
> the error on my system, if you have CPU time to burn on the Docker
> container build. But since the error seems to be localized within my
> aarch64 "libc.a", I'm hopeful that maybe this funky environment
> doesn't actually affect the core situation here. As you might see in
> the output above, I'm using musl 1.2.3.
> 
> Thanks for any insight,
> 
> Peter
> 


* Re: [musl] aarch64 sigsetjmp relocation truncation bug, maybe
  2023-09-07  0:46 [musl] aarch64 sigsetjmp relocation truncation bug, maybe Peter Williams
  2023-09-07  1:01 ` [musl] " Peter Williams
@ 2023-09-07  3:08 ` Markus Wichmann
  2023-09-07 12:48   ` Rich Felker
  1 sibling, 1 reply; 7+ messages in thread
From: Markus Wichmann @ 2023-09-07  3:08 UTC
  To: musl

Am Wed, Sep 06, 2023 at 08:46:32PM -0400 schrieb Peter Williams:
> If I'm understanding correctly, the complaint is that a branch in
> sigsetjmp that invokes setjmp is too far away from the definition of
> setjmp. My very handwavey idea is that maybe for some reason my program
> is causing the linker to want to locate setjmp() and sigsetjmp() really
> far away from each other. If that's right, perhaps it would be possible
> to modify the assembler code to be able to handle such a situation?

I'm guessing the same. Pretty much all architectures have shorter
conditional than unconditional branches. That is why branches to other
files (technically to other sections) should always be unconditional. I
am attaching a simple patch that should help with the situation.

Ciao,
Markus

[-- Attachment #2: 0001-Make-branch-to-external-symbol-unconditional.patch --]
[-- Type: text/x-diff, Size: 966 bytes --]

From dd227e22a5337d54e1cb0838410bca6672c76c43 Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Thu, 7 Sep 2023 05:01:23 +0200
Subject: [PATCH] Make branch to external symbol unconditional.

Conditional branches have a shorter branch length than unconditional
ones, and almost all ABIs require unconditional branches to external
symbols. Otherwise linkers may create broken binaries.
---
 src/signal/aarch64/sigsetjmp.s | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/signal/aarch64/sigsetjmp.s b/src/signal/aarch64/sigsetjmp.s
index 75910c43..9a28e395 100644
--- a/src/signal/aarch64/sigsetjmp.s
+++ b/src/signal/aarch64/sigsetjmp.s
@@ -4,7 +4,7 @@
 .type __sigsetjmp,%function
 sigsetjmp:
 __sigsetjmp:
-	cbz x1,setjmp
+	cbz x1,1f

 	str x30,[x0,#176]
 	str x19,[x0,#176+8+8]
@@ -19,3 +19,5 @@ __sigsetjmp:

 .hidden __sigsetjmp_tail
 	b __sigsetjmp_tail
+
+1: b setjmp
--
2.39.2


* Re: [musl] aarch64 sigsetjmp relocation truncation bug, maybe
  2023-09-07  3:08 ` [musl] " Markus Wichmann
@ 2023-09-07 12:48   ` Rich Felker
  2023-09-07 13:28     ` Rich Felker
  2023-09-07 14:42     ` Markus Wichmann
  0 siblings, 2 replies; 7+ messages in thread
From: Rich Felker @ 2023-09-07 12:48 UTC
  To: musl

On Thu, Sep 07, 2023 at 05:08:13AM +0200, Markus Wichmann wrote:
> Am Wed, Sep 06, 2023 at 08:46:32PM -0400 schrieb Peter Williams:
> > If I'm understanding correctly, the complaint is that a branch in
> > sigsetjmp that invokes setjmp is too far away from the definition of
> > setjmp. My very handwavey idea is that maybe for some reason my program
> > is causing the linker to want to locate setjmp() and sigsetjmp() really
> > far away from each other. If that's right, perhaps it would be possible
> > to modify the assembler code to be able to handle such a situation?
> 
> I'm guessing the same. Pretty much all architectures have shorter
> conditional than unconditional branches. That is why branches to other
> files (technically to other sections) should always be unconditional. I
> am attaching a simple patch that should help with the situation.
> 
> Ciao,
> Markus

> From dd227e22a5337d54e1cb0838410bca6672c76c43 Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Thu, 7 Sep 2023 05:01:23 +0200
> Subject: [PATCH] Make branch to external symbol unconditional.
> 
> Conditional branches have a shorter branch length than unconditional
> ones, and almost all ABIs require unconditional branches to external
> symbols. Otherwise linkers may create broken binaries.
> ---
>  src/signal/aarch64/sigsetjmp.s | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/signal/aarch64/sigsetjmp.s b/src/signal/aarch64/sigsetjmp.s
> index 75910c43..9a28e395 100644
> --- a/src/signal/aarch64/sigsetjmp.s
> +++ b/src/signal/aarch64/sigsetjmp.s
> @@ -4,7 +4,7 @@
>  .type __sigsetjmp,%function
>  sigsetjmp:
>  __sigsetjmp:
> -	cbz x1,setjmp
> +	cbz x1,1f
> 
>  	str x30,[x0,#176]
>  	str x19,[x0,#176+8+8]
> @@ -19,3 +19,5 @@ __sigsetjmp:
> 
>  .hidden __sigsetjmp_tail
>  	b __sigsetjmp_tail
> +
> +1: b setjmp
> --
> 2.39.2

Are you sure this is the actual problem? I think it's that the aarch64
(and several other archs) version of sigsetjmp is wrongly using the
public setjmp symbol whose definition is possibly provided by a PLT
thunk in the main program, rather than either setjmp@PLT (which would
necessarily be the right local call point to use) or the hidden
___setjmp symbol that exists for this purpose (which i386, for
example, uses).

Rich

* Re: [musl] aarch64 sigsetjmp relocation truncation bug, maybe
  2023-09-07 12:48   ` Rich Felker
@ 2023-09-07 13:28     ` Rich Felker
  2023-09-07 19:49       ` Szabolcs Nagy
  2023-09-07 14:42     ` Markus Wichmann
  1 sibling, 1 reply; 7+ messages in thread
From: Rich Felker @ 2023-09-07 13:28 UTC
  To: musl; +Cc: Peter Williams

(Re-adding the OP to CC)

On Thu, Sep 07, 2023 at 08:48:28AM -0400, Rich Felker wrote:
> On Thu, Sep 07, 2023 at 05:08:13AM +0200, Markus Wichmann wrote:
> > Am Wed, Sep 06, 2023 at 08:46:32PM -0400 schrieb Peter Williams:
> > > If I'm understanding correctly, the complaint is that a branch in
> > > sigsetjmp that invokes setjmp is too far away from the definition of
> > > setjmp. My very handwavey idea is that maybe for some reason my program
> > > is causing the linker to want to locate setjmp() and sigsetjmp() really
> > > far away from each other. If that's right, perhaps it would be possible
> > > to modify the assembler code to be able to handle such a situation?
> > 
> > I'm guessing the same. Pretty much all architectures have shorter
> > conditional than unconditional branches. That is why branches to other
> > files (technically to other sections) should always be unconditional. I
> > am attaching a simple patch that should help with the situation.
> > 
> > Ciao,
> > Markus
> 
> > From dd227e22a5337d54e1cb0838410bca6672c76c43 Mon Sep 17 00:00:00 2001
> > From: Markus Wichmann <nullplan@gmx.net>
> > Date: Thu, 7 Sep 2023 05:01:23 +0200
> > Subject: [PATCH] Make branch to external symbol unconditional.
> > 
> > Conditional branches have a shorter branch length than unconditional
> > ones, and almost all ABIs require unconditional branches to external
> > symbols. Otherwise linkers may create broken binaries.
> > ---
> >  src/signal/aarch64/sigsetjmp.s | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/signal/aarch64/sigsetjmp.s b/src/signal/aarch64/sigsetjmp.s
> > index 75910c43..9a28e395 100644
> > --- a/src/signal/aarch64/sigsetjmp.s
> > +++ b/src/signal/aarch64/sigsetjmp.s
> > @@ -4,7 +4,7 @@
> >  .type __sigsetjmp,%function
> >  sigsetjmp:
> >  __sigsetjmp:
> > -	cbz x1,setjmp
> > +	cbz x1,1f
> > 
> >  	str x30,[x0,#176]
> >  	str x19,[x0,#176+8+8]
> > @@ -19,3 +19,5 @@ __sigsetjmp:
> > 
> >  .hidden __sigsetjmp_tail
> >  	b __sigsetjmp_tail
> > +
> > +1: b setjmp
> > --
> > 2.39.2
> 
> Are you sure this is the actual problem? I think it's that the aarch64
> (and several other archs) version of sigsetjmp is wrongly using the
> public setjmp symbol whose definition is possibly provided by a PLT
> thunk in the main program, rather than either setjmp@PLT (which would
> necessarily be the right local call point to use) or the hidden
> ___setjmp symbol that exists for this purpose (which i386, for
> example, uses).

Hmm, no, that seems to be a separate issue. The reported issue is
indeed a large static link where PLT stuff should not come into play.
So I think the cbz limited-range is indeed the issue.

Rich

* Re: [musl] aarch64 sigsetjmp relocation truncation bug, maybe
  2023-09-07 12:48   ` Rich Felker
  2023-09-07 13:28     ` Rich Felker
@ 2023-09-07 14:42     ` Markus Wichmann
  1 sibling, 0 replies; 7+ messages in thread
From: Markus Wichmann @ 2023-09-07 14:42 UTC
  To: musl; +Cc: Peter Williams

Am Thu, Sep 07, 2023 at 08:48:28AM -0400 schrieb Rich Felker:
> Are you sure this is the actual problem? I think it's that the aarch64
> (and several other archs) version of sigsetjmp is wrongly using the
> public setjmp symbol whose definition is possibly provided by a PLT
> thunk in the main program, rather than either setjmp@PLT (which would
> necessarily be the right local call point to use) or the hidden
> ___setjmp symbol that exists for this purpose (which i386, for
> example, uses).
>
> Rich

No, I am not sure. I wrote that patch before heading to work, without
even test-compiling, and I don't know the first thing about arm64. But
every architecture I have ever looked into at any depth had a shorter
conditional branch than unconditional branch, and the linker normally
presumes to be able to rearrange input code sections at will, at least
for the branch length of an unconditional branch. Anything more usually
requires more specialized code and specialized options to the compiler.
That's why I wrote the patch in that way.

Of course you are right that I did not think about the PLT, or a
possible symbol interposition. However, the subroutine call to setjmp
that was already in sigsetjmp also didn't. And the prior version of the
code as well. So at least I didn't worsen the situation.

Ciao,
Markus

* Re: [musl] aarch64 sigsetjmp relocation truncation bug, maybe
  2023-09-07 13:28     ` Rich Felker
@ 2023-09-07 19:49       ` Szabolcs Nagy
  0 siblings, 0 replies; 7+ messages in thread
From: Szabolcs Nagy @ 2023-09-07 19:49 UTC
  To: Rich Felker; +Cc: musl, Peter Williams

* Rich Felker <dalias@libc.org> [2023-09-07 09:28:52 -0400]:
> On Thu, Sep 07, 2023 at 08:48:28AM -0400, Rich Felker wrote:
> > On Thu, Sep 07, 2023 at 05:08:13AM +0200, Markus Wichmann wrote:
> > > Am Wed, Sep 06, 2023 at 08:46:32PM -0400 schrieb Peter Williams:
> > > > If I'm understanding correctly, the complaint is that a branch in
> > > > sigsetjmp that invokes setjmp is too far away from the definition of
> > > > setjmp. My very handwavey idea is that maybe for some reason my program
> > > > is causing the linker to want to locate setjmp() and sigsetjmp() really
> > > > far away from each other. If that's right, perhaps it would be possible
> > > > to modify the assembler code to be able to handle such a situation?
> > > 
> > > I'm guessing the same. Pretty much all architectures have shorter
> > > conditional than unconditional branches. That is why branches to other
> > > files (technically to other sections) should always be unconditional. I
> > > am attaching a simple patch that should help with the situation.
> > > 
> > > Ciao,
> > > Markus
> > 
> > > From dd227e22a5337d54e1cb0838410bca6672c76c43 Mon Sep 17 00:00:00 2001
> > > From: Markus Wichmann <nullplan@gmx.net>
> > > Date: Thu, 7 Sep 2023 05:01:23 +0200
> > > Subject: [PATCH] Make branch to external symbol unconditional.
> > > 
> > > Conditional branches have a shorter branch length than unconditional
> > > ones, and almost all ABIs require unconditional branches to external
> > > symbols. Otherwise linkers may create broken binaries.
> > > ---
> > >  src/signal/aarch64/sigsetjmp.s | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/signal/aarch64/sigsetjmp.s b/src/signal/aarch64/sigsetjmp.s
> > > index 75910c43..9a28e395 100644
> > > --- a/src/signal/aarch64/sigsetjmp.s
> > > +++ b/src/signal/aarch64/sigsetjmp.s
> > > @@ -4,7 +4,7 @@
> > >  .type __sigsetjmp,%function
> > >  sigsetjmp:
> > >  __sigsetjmp:
> > > -	cbz x1,setjmp
> > > +	cbz x1,1f
> > > 
> > >  	str x30,[x0,#176]
> > >  	str x19,[x0,#176+8+8]
> > > @@ -19,3 +19,5 @@ __sigsetjmp:
> > > 
> > >  .hidden __sigsetjmp_tail
> > >  	b __sigsetjmp_tail
> > > +
> > > +1: b setjmp

yeah, this looks good except the commit message can be improved.

conditional branch has +-1M reach, unconditional branch has +-128M
reach and the linker inserts a veneer if unconditional branch goes
further than that (i.e. it can go arbitrarily far).
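
The reach figures above follow from the width of the immediate fields, as this rough C sketch tries to show. It assumes signed word offsets of 19 bits for conditional branches such as cbz and 26 bits for b. The names `branch_reach_bytes`, `plan_branch`, and the enum are hypothetical and only illustrative; they are not linker APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reach in bytes (one direction) of a PC-relative branch whose immediate
 * is a signed field of imm_bits bits counting 4-byte instructions.
 */
static int64_t branch_reach_bytes(unsigned imm_bits)
{
    assert(imm_bits >= 1 && imm_bits <= 32);
    /* (imm_bits - 1) magnitude bits, plus 2 bits for the 4-byte scaling. */
    return INT64_C(1) << (imm_bits - 1 + 2);
}

enum branch_plan {
    DIRECT_CONDITIONAL,   /* cbz/cbnz/b.cond can reach the target */
    DIRECT_UNCONDITIONAL, /* only b/bl can reach the target directly */
    NEEDS_VENEER          /* the linker has to insert a veneer */
};

/* Classify a signed byte distance between a branch and its target. */
static enum branch_plan plan_branch(int64_t distance)
{
    int64_t cond = branch_reach_bytes(19);   /* 1 MiB */
    int64_t uncond = branch_reach_bytes(26); /* 128 MiB */

    if (distance >= -cond && distance < cond)
        return DIRECT_CONDITIONAL;
    if (distance >= -uncond && distance < uncond)
        return DIRECT_UNCONDITIONAL;
    return NEEDS_VENEER;
}
```

Under these assumptions a distance of 100 MiB, for instance, classifies as DIRECT_UNCONDITIONAL. A plain b can cover it, while cbz cannot.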

setjmp is a local symbol to libc (or in case of static linking to
the exe), so using a conditional branch is perfectly fine if we
ensure the linker does not move the symbols far apart (e.g. putting
the two symbols in the same object file or into the same small
section is enough). this relies on user code not being able to
interpose setjmp (link namespace violation) and on setjmp in libc
not using the plt.

note that aarch64 libc.so text is < 1M so we can compile it with
-mcmodel=tiny code model and then the compiler would emit cond
branch for libc symbols. but .lo objects built like that are not
suitable for libc.a.

the simple b setjmp is the reasonable fix for now.

> > Are you sure this is the actual problem? I think it's that the aarch64
> > (and several other archs) version of sigsetjmp is wrongly using the
> > public setjmp symbol whose definition is possibly provided by a PLT
> > thunk in the main program, rather than either setjmp@PLT (which would
> > necessarily be the right local call point to use) or the hidden
> > ___setjmp symbol that exists for this purpose (which i386, for
> > example, uses).
> 
> Hmm, no, that seems to be a separate issue. The reported issue is
> indeed a large static link where PLT stuff should not come into play.
> So I think the cbz limited-range is indeed the issue.
> 
> Rich
