* [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
@ 2024-11-23 0:20 Alex Rønne Petersen
2024-11-23 8:30 ` Alexander Monakov
0 siblings, 1 reply; 9+ messages in thread
From: Alex Rønne Petersen @ 2024-11-23 0:20 UTC
To: musl; +Cc: Alex Rønne Petersen
Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
linker error when building musl libc.so with zig cc.
---
src/thread/s390x/__tls_get_offset.s | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/thread/s390x/__tls_get_offset.s b/src/thread/s390x/__tls_get_offset.s
index 8ee92de8..2e0913cc 100644
--- a/src/thread/s390x/__tls_get_offset.s
+++ b/src/thread/s390x/__tls_get_offset.s
@@ -5,6 +5,7 @@ __tls_get_offset:
aghi %r15, -160

la %r2, 0(%r2, %r12)
+.hidden __tls_get_addr
brasl %r14, __tls_get_addr

ear %r1, %a0
--
2.40.1
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-23 0:20 [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it Alex Rønne Petersen
@ 2024-11-23 8:30 ` Alexander Monakov
2024-11-23 12:15 ` Alex Rønne Petersen
0 siblings, 1 reply; 9+ messages in thread
From: Alexander Monakov @ 2024-11-23 8:30 UTC
To: musl; +Cc: Alex Rønne Petersen
On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> linker error when building musl libc.so with zig cc.
Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
hidden in libc.so. Unusual.
(linkers must take the most restrictive visibility from all mentions of a symbol)
I'm curious, what kind of error with zig cc were you seeing?
Thanks.
Alexander
> ---
> src/thread/s390x/__tls_get_offset.s | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/src/thread/s390x/__tls_get_offset.s b/src/thread/s390x/__tls_get_offset.s
> index 8ee92de8..2e0913cc 100644
> --- a/src/thread/s390x/__tls_get_offset.s
> +++ b/src/thread/s390x/__tls_get_offset.s
> @@ -5,6 +5,7 @@ __tls_get_offset:
> aghi %r15, -160
>
> la %r2, 0(%r2, %r12)
> +.hidden __tls_get_addr
> brasl %r14, __tls_get_addr
>
> ear %r1, %a0
>
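The visibility rule mentioned above (the linker takes the most restrictive visibility from all mentions of a symbol) can be sketched as follows. The numeric STV_* values are the ones the ELF gABI assigns to the low bits of st_other; the helper names stv_rank and merge_visibility are hypothetical and exist only for this illustration, not in any linker.

```c
/* ELF symbol visibility values as stored in the low bits of st_other. */
enum stv { STV_DEFAULT = 0, STV_INTERNAL = 1, STV_HIDDEN = 2, STV_PROTECTED = 3 };

/* Rank visibilities from least to most constraining.
 * (Hypothetical helper; the numeric STV_* values are not ordered this way.) */
static int stv_rank(enum stv v)
{
	switch (v) {
	case STV_DEFAULT:   return 0;
	case STV_PROTECTED: return 1;
	case STV_HIDDEN:    return 2;
	case STV_INTERNAL:  return 3;
	}
	return 0;
}

/* Hypothetical helper: combine the visibility recorded for the same symbol
 * in two object files by keeping the more constraining one. */
static enum stv merge_visibility(enum stv a, enum stv b)
{
	return stv_rank(a) >= stv_rank(b) ? a : b;
}
```

Under this model, a default-visibility definition in __tls_get_addr.lo combined with the .hidden reference in __tls_get_offset.lo yields a hidden symbol in libc.so, which is the outcome described above.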
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-23 8:30 ` Alexander Monakov
@ 2024-11-23 12:15 ` Alex Rønne Petersen
2024-11-23 12:36 ` Alexander Monakov
0 siblings, 1 reply; 9+ messages in thread
From: Alex Rønne Petersen @ 2024-11-23 12:15 UTC
To: Alexander Monakov; +Cc: musl
On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
>
> > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > linker error when building musl libc.so with zig cc.
>
> Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> hidden in libc.so. Unusual.
>
> (linkers must take the most restrictive visibility from all mentions of a symbol)
>
> I'm curious, what kind of error with zig cc were you seeing?
This:
ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
'__tls_get_addr'; recompile with -fPIC
>>> defined in obj/src/thread/__tls_get_addr.lo
>>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
>>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
(-fPIC is actually in use.)
Presumably this could be fixed in lld, considering GNU ld seems fine
with it. But I figured that, since glibc also marks __tls_get_addr
hidden for s390x, musl should probably just do the same anyway.
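
For context on the relocation named in the error: R_390_PC32DBL asks the linker to store (S + A - P) >> 1, a signed 32-bit count of halfwords, where S is the symbol address, A the addend and P the address of the relocated field. The sketch below only illustrates that arithmetic. The function name pc32dbl_value is hypothetical and the code is not taken from lld or GNU ld. The value can only be fixed at link time when the call is known to bind to the definition inside libc.so, which is presumably why marking the symbol hidden satisfies lld.

```c
#include <stdint.h>

/*
 * Value stored for an R_390_PC32DBL relocation: (S + A - P) >> 1.
 * Returns 0 and writes *out on success, or -1 if the distance is odd
 * or does not fit in a signed 32-bit halfword count.
 * (Hypothetical helper for illustration only.)
 */
static int pc32dbl_value(uint64_t s, int64_t a, uint64_t p, int32_t *out)
{
	int64_t delta = (int64_t)(s + (uint64_t)a - p);
	if (delta & 1)
		return -1;              /* targets must be halfword aligned */
	int64_t half = delta / 2;       /* exact, since delta is even */
	if (half < INT32_MIN || half > INT32_MAX)
		return -1;              /* out of range for the 32-bit field */
	*out = (int32_t)half;
	return 0;
}
```

For a brasl instruction at address I, the relocated field sits at I + 2 and the addend is 2, so the stored value works out to (S - I) / 2.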
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-23 12:15 ` Alex Rønne Petersen
@ 2024-11-23 12:36 ` Alexander Monakov
2024-11-23 12:57 ` Alex Rønne Petersen
0 siblings, 1 reply; 9+ messages in thread
From: Alexander Monakov @ 2024-11-23 12:36 UTC
To: Alex Rønne Petersen; +Cc: musl
On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> >
> > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> >
> > > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > > linker error when building musl libc.so with zig cc.
> >
> > Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> > hidden in libc.so. Unusual.
> >
> > (linkers must take the most restrictive visibility from all mentions of a symbol)
> >
> > I'm curious, what kind of error with zig cc were you seeing?
>
> This:
>
> ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
> '__tls_get_addr'; recompile with -fPIC
> >>> defined in obj/src/thread/__tls_get_addr.lo
> >>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
> >>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
>
> (-fPIC is actually in use.)
>
> Presumably this could be fixed in lld, considering GNU ld seems fine
> with it. But I figured that, since glibc also marks __tls_get_addr
> hidden for s390x, musl should probably just do the same anyway.
I see, thanks. Your commit message was confusing to me, because unlike
__syscall_ret and the like, __tls_get_addr is not an internal helper,
it may not have hidden visibility anywhere except s390. So it felt like
the commit message was drawing a false parallel.
I would love this to land with a clearer commit message, but that's up
to Rich and yourself to sort out.
Alexander
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-23 12:36 ` Alexander Monakov
@ 2024-11-23 12:57 ` Alex Rønne Petersen
2024-11-29 13:48 ` Rich Felker
0 siblings, 1 reply; 9+ messages in thread
From: Alex Rønne Petersen @ 2024-11-23 12:57 UTC
To: Alexander Monakov; +Cc: musl
On Sat, Nov 23, 2024 at 1:36 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
>
> > On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> > >
> > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > >
> > > > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > > > linker error when building musl libc.so with zig cc.
> > >
> > > Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> > > hidden in libc.so. Unusual.
> > >
> > > (linkers must take the most restrictive visibility from all mentions of a symbol)
> > >
> > > I'm curious, what kind of error with zig cc were you seeing?
> >
> > This:
> >
> > ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
> > '__tls_get_addr'; recompile with -fPIC
> > >>> defined in obj/src/thread/__tls_get_addr.lo
> > >>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
> > >>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
> >
> > (-fPIC is actually in use.)
> >
> > Presumably this could be fixed in lld, considering GNU ld seems fine
> > with it. But I figured that, since glibc also marks __tls_get_addr
> > hidden for s390x, musl should probably just do the same anyway.
>
> I see, thanks. Your commit message was confusing to me, because unlike
> __syscall_ret and the like, __tls_get_addr is not an internal helper,
> it may not have hidden visibility anywhere except s390. So it felt like
> the commit message was drawing a false parallel.
>
> I would love this to land with a clearer commit message, but that's up
> to Rich and yourself to sort out.
Yeah, I think that's fair. I wrote the commit message before I
actually investigated in detail how __tls_get_addr is supposed to be
handled for s390x.
Should I re-send the patch with an updated commit message, or how is
this usually handled?
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-23 12:57 ` Alex Rønne Petersen
@ 2024-11-29 13:48 ` Rich Felker
2024-11-29 19:49 ` Alex Rønne Petersen
0 siblings, 1 reply; 9+ messages in thread
From: Rich Felker @ 2024-11-29 13:48 UTC
To: Alex Rønne Petersen; +Cc: Alexander Monakov, musl
On Sat, Nov 23, 2024 at 01:57:16PM +0100, Alex Rønne Petersen wrote:
> On Sat, Nov 23, 2024 at 1:36 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> >
> > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> >
> > > On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > >
> > > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > > >
> > > > > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > > > > linker error when building musl libc.so with zig cc.
> > > >
> > > > Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> > > > hidden in libc.so. Unusual.
> > > >
> > > > (linkers must take the most restrictive visibility from all mentions of a symbol)
> > > >
> > > > I'm curious, what kind of error with zig cc were you seeing?
> > >
> > > This:
> > >
> > > ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
> > > '__tls_get_addr'; recompile with -fPIC
> > > >>> defined in obj/src/thread/__tls_get_addr.lo
> > > >>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
> > > >>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
> > >
> > > (-fPIC is actually in use.)
> > >
> > > Presumably this could be fixed in lld, considering GNU ld seems fine
> > > with it. But I figured that, since glibc also marks __tls_get_addr
> > > hidden for s390x, musl should probably just do the same anyway.
> >
> > I see, thanks. Your commit message was confusing to me, because unlike
> > __syscall_ret and the like, __tls_get_addr is not an internal helper,
> > it may not have hidden visibility anywhere except s390. So it felt like
> > the commit message was drawing a false parallel.
> >
> > I would love this to land with a clearer commit message, but that's up
> > to Rich and yourself to sort out.
>
> Yeah, I think that's fair. I wrote the commit message before I
> actually investigated in detail how __tls_get_addr is supposed to be
> handled for s390x.
>
> Should I re-send the patch with an updated commit message, or how is
> this usually handled?
While s390x doesn't need __tls_get_addr to be a public symbol, I'd
kinda prefer not to have an arch-specific hack to make it hidden.
Looking at the code, it's got to be significantly gratuitously slow
having __tls_get_offset making a second function call to
__tls_get_addr, setting up a stack frame and all.
The __tls_get_offset code dates back to 2016 when it was actually
necessary to call into C code in case new TLS needed to be installed.
Since 2019 (9d44b6460a) that's not necessary, so I think we could just
open code the asm for __tls_get_offset entirely and have it be
decently fast.
Rich
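
As a rough C model of what the current two-step arrangement computes: __tls_get_offset forms the address of the TLS index pair from the GOT pointer plus its argument (the `la %r2, 0(%r2, %r12)` step), calls __tls_get_addr to get an absolute address, then subtracts the thread pointer. The struct layouts and the model_* names below are simplified stand-ins invented for this sketch, not musl's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a TLS index pair: { module ID, offset }. */
typedef struct { size_t module; size_t offset; } tls_index_model;

/* Simplified stand-in for per-thread state: dtv[m] is the base address of
 * module m's TLS block, tp is the thread pointer value. */
typedef struct { uintptr_t tp; const uintptr_t *dtv; } thread_model;

/* Models the C __tls_get_addr: returns an absolute address. */
static uintptr_t model_tls_get_addr(const thread_model *t, const tls_index_model *ti)
{
	return t->dtv[ti->module] + ti->offset;
}

/* Models the s390x entry point: locate the index pair at got + got_off,
 * resolve it, and return the result relative to the thread pointer. */
static ptrdiff_t model_tls_get_offset(const thread_model *t,
                                      const unsigned char *got, size_t got_off)
{
	tls_index_model ti;
	memcpy(&ti, got + got_off, sizeof ti);  /* avoids alignment assumptions */
	return (ptrdiff_t)(model_tls_get_addr(t, &ti) - t->tp);
}
```

Open coding the assembly, as proposed, would fold model_tls_get_addr into model_tls_get_offset so that no second call or stack frame is needed.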
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-29 13:48 ` Rich Felker
@ 2024-11-29 19:49 ` Alex Rønne Petersen
2024-11-30 3:20 ` Rich Felker
0 siblings, 1 reply; 9+ messages in thread
From: Alex Rønne Petersen @ 2024-11-29 19:49 UTC
To: Rich Felker; +Cc: Alexander Monakov, musl
On Fri, Nov 29, 2024 at 2:48 PM Rich Felker <dalias@libc.org> wrote:
>
> On Sat, Nov 23, 2024 at 01:57:16PM +0100, Alex Rønne Petersen wrote:
> > On Sat, Nov 23, 2024 at 1:36 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> > >
> > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > >
> > > > On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > > >
> > > > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > > > >
> > > > > > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > > > > > linker error when building musl libc.so with zig cc.
> > > > >
> > > > > Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> > > > > hidden in libc.so. Unusual.
> > > > >
> > > > > (linkers must take the most restrictive visibility from all mentions of a symbol)
> > > > >
> > > > > I'm curious, what kind of error with zig cc were you seeing?
> > > >
> > > > This:
> > > >
> > > > ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
> > > > '__tls_get_addr'; recompile with -fPIC
> > > > >>> defined in obj/src/thread/__tls_get_addr.lo
> > > > >>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
> > > > >>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
> > > >
> > > > (-fPIC is actually in use.)
> > > >
> > > > Presumably this could be fixed in lld, considering GNU ld seems fine
> > > > with it. But I figured that, since glibc also marks __tls_get_addr
> > > > hidden for s390x, musl should probably just do the same anyway.
> > >
> > > I see, thanks. Your commit message was confusing to me, because unlike
> > > __syscall_ret and the like, __tls_get_addr is not an internal helper,
> > > it may not have hidden visibility anywhere except s390. So it felt like
> > > the commit message was drawing a false parallel.
> > >
> > > I would love this to land with a clearer commit message, but that's up
> > > to Rich and yourself to sort out.
> >
> > Yeah, I think that's fair. I wrote the commit message before I
> > actually investigated in detail how __tls_get_addr is supposed to be
> > handled for s390x.
> >
> > Should I re-send the patch with an updated commit message, or how is
> > this usually handled?
>
> While s390x doesn't need __tls_get_addr to be a public symbol, I'd
> kinda prefer not to have an arch-specific hack to make it hidden.
> Looking at the code, it's got to be significantly gratuitously slow
> having __tls_get_offset making a second function call to
> __tls_get_addr, setting up a stack frame and all.
>
> The __tls_get_offset code dates back to 2016 when it was actually
> necessary to call into C code in case new TLS needed to be installed.
> Since 2019 (9d44b6460a) that's not necessary, so I think we could just
> open code the asm for __tls_get_offset entirely and have it be
> decently fast.
That sounds reasonable. I don't have a ton of experience with writing
s390x assembly, though. I can do the obvious thing and extract the
compiled logic from __tls_get_addr without the calling convention
fluff. Would that be sufficient?
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-29 19:49 ` Alex Rønne Petersen
@ 2024-11-30 3:20 ` Rich Felker
2024-11-30 17:51 ` Fangrui Song
0 siblings, 1 reply; 9+ messages in thread
From: Rich Felker @ 2024-11-30 3:20 UTC
To: Alex Rønne Petersen; +Cc: Alexander Monakov, musl
On Fri, Nov 29, 2024 at 08:49:00PM +0100, Alex Rønne Petersen wrote:
> On Fri, Nov 29, 2024 at 2:48 PM Rich Felker <dalias@libc.org> wrote:
> >
> > On Sat, Nov 23, 2024 at 01:57:16PM +0100, Alex Rønne Petersen wrote:
> > > On Sat, Nov 23, 2024 at 1:36 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > >
> > > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > > >
> > > > > On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > > > >
> > > > > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > > > > >
> > > > > > > > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > > > > > > linker error when building musl libc.so with zig cc.
> > > > > >
> > > > > > Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> > > > > > hidden in libc.so. Unusual.
> > > > > >
> > > > > > (linkers must take the most restrictive visibility from all mentions of a symbol)
> > > > > >
> > > > > > I'm curious, what kind of error with zig cc were you seeing?
> > > > >
> > > > > This:
> > > > >
> > > > > ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
> > > > > '__tls_get_addr'; recompile with -fPIC
> > > > > >>> defined in obj/src/thread/__tls_get_addr.lo
> > > > > >>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
> > > > > >>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
> > > > >
> > > > > (-fPIC is actually in use.)
> > > > >
> > > > > Presumably this could be fixed in lld, considering GNU ld seems fine
> > > > > with it. But I figured that, since glibc also marks __tls_get_addr
> > > > > hidden for s390x, musl should probably just do the same anyway.
> > > >
> > > > I see, thanks. Your commit message was confusing to me, because unlike
> > > > __syscall_ret and the like, __tls_get_addr is not an internal helper,
> > > > it may not have hidden visibility anywhere except s390. So it felt like
> > > > the commit message was drawing a false parallel.
> > > >
> > > > I would love this to land with a clearer commit message, but that's up
> > > > to Rich and yourself to sort out.
> > >
> > > Yeah, I think that's fair. I wrote the commit message before I
> > > actually investigated in detail how __tls_get_addr is supposed to be
> > > handled for s390x.
> > >
> > > Should I re-send the patch with an updated commit message, or how is
> > > this usually handled?
> >
> > While s390x doesn't need __tls_get_addr to be a public symbol, I'd
> > kinda prefer not to have an arch-specific hack to make it hidden.
> > Looking at the code, it's got to be significantly gratuitously slow
> > having __tls_get_offset making a second function call to
> > __tls_get_addr, setting up a stack frame and all.
> >
> > The __tls_get_offset code dates back to 2016 when it was actually
> > necessary to call into C code in case new TLS needed to be installed.
> > Since 2019 (9d44b6460a) that's not necessary, so I think we could just
> > open code the asm for __tls_get_offset entirely and have it be
> > decently fast.
>
> That sounds reasonable. I don't have a ton of experience with writing
> s390x assembly, though. I can do the obvious thing and extract the
> compiled logic from __tls_get_addr without the calling convention
> fluff. Would that be sufficient?
That's what I was looking at doing. Basically just compiling a
modified version of __tls_get_addr that subtracts the thread pointer,
then prepending the code to load the index address from the GOT
pointer argument in r12.
A further optimization later could be storing the address with tp
pre-subtracted in the dtv. This would also be optimal for archs with
TLSDESC support, at the expense of an extra addition in legacy
__tls_get_addr access. On some archs it may even save a temp register
in the TLSDESC function.
Rich
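
The further optimization described above, storing tp-relative values in the dtv, might look roughly like this in C. This is a sketch under assumptions: the dtv_rel field, the struct layouts and the rel_* function names are hypothetical and do not reflect musl's current data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a TLS index pair: { module ID, offset }. */
typedef struct { size_t module; size_t offset; } tls_index_model;

/* Variant per-thread state: dtv_rel[m] holds (block base - tp) rather than
 * the absolute base address of module m's TLS block. */
typedef struct { uintptr_t tp; const uintptr_t *dtv_rel; } thread_model_rel;

/* Offset-returning paths (s390x __tls_get_offset, TLSDESC functions)
 * need no subtraction at all. */
static uintptr_t rel_tls_get_offset(const thread_model_rel *t, const tls_index_model *ti)
{
	return t->dtv_rel[ti->module] + ti->offset;
}

/* The legacy absolute-address path pays for one extra addition. */
static uintptr_t rel_tls_get_addr(const thread_model_rel *t, const tls_index_model *ti)
{
	return t->tp + rel_tls_get_offset(t, ti);
}
```

The trade-off is visible in the two functions: the offset path loses its subtraction, while __tls_get_addr gains the addition of tp.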
* Re: [musl] [PATCH] s390x: Mark __tls_get_addr hidden before invoking it.
2024-11-30 3:20 ` Rich Felker
@ 2024-11-30 17:51 ` Fangrui Song
0 siblings, 0 replies; 9+ messages in thread
From: Fangrui Song @ 2024-11-30 17:51 UTC
To: musl; +Cc: Alex Rønne Petersen, Alexander Monakov
On Fri, Nov 29, 2024 at 7:20 PM Rich Felker <dalias@libc.org> wrote:
>
> On Fri, Nov 29, 2024 at 08:49:00PM +0100, Alex Rønne Petersen wrote:
> > On Fri, Nov 29, 2024 at 2:48 PM Rich Felker <dalias@libc.org> wrote:
> > >
> > > On Sat, Nov 23, 2024 at 01:57:16PM +0100, Alex Rønne Petersen wrote:
> > > > On Sat, Nov 23, 2024 at 1:36 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > > >
> > > > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > > > >
> > > > > > On Sat, Nov 23, 2024 at 9:30 AM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > > > > >
> > > > > > > On Sat, 23 Nov 2024, Alex Rønne Petersen wrote:
> > > > > > >
> > > > > > > > > Similar to what's done for __syscall_ret, __sigsetjmp_tail, etc. This fixes a
> > > > > > > > linker error when building musl libc.so with zig cc.
> > > > > > >
> > > > > > > Hm, on s390 __tls_get_addr is not used for TLS ABI, so it's fine that it ends up
> > > > > > > hidden in libc.so. Unusual.
> > > > > > >
> > > > > > > (linkers must take the most restrictive visibility from all mentions of a symbol)
> > > > > > >
> > > > > > > I'm curious, what kind of error with zig cc were you seeing?
> > > > > >
> > > > > > This:
> > > > > >
> > > > > > ld.lld: error: relocation R_390_PC32DBL cannot be used against symbol
> > > > > > '__tls_get_addr'; recompile with -fPIC
> > > > > > >>> defined in obj/src/thread/__tls_get_addr.lo
> > > > > > >>> referenced by __tls_get_offset.s:8 (src/thread/s390x/__tls_get_offset.s:8)
> > > > > > >>> obj/src/thread/s390x/__tls_get_offset.lo:(.text+0x10)
> > > > > >
> > > > > > (-fPIC is actually in use.)
> > > > > >
> > > > > > Presumably this could be fixed in lld, considering GNU ld seems fine
> > > > > > with it. But I figured that, since glibc also marks __tls_get_addr
> > > > > > hidden for s390x, musl should probably just do the same anyway.
> > > > >
> > > > > I see, thanks. Your commit message was confusing to me, because unlike
> > > > > __syscall_ret and the like, __tls_get_addr is not an internal helper,
> > > > > it may not have hidden visibility anywhere except s390. So it felt like
> > > > > the commit message was drawing a false parallel.
> > > > >
> > > > > I would love this to land with a clearer commit message, but that's up
> > > > > to Rich and yourself to sort out.
> > > >
> > > > Yeah, I think that's fair. I wrote the commit message before I
> > > > actually investigated in detail how __tls_get_addr is supposed to be
> > > > handled for s390x.
> > > >
> > > > Should I re-send the patch with an updated commit message, or how is
> > > > this usually handled?
> > >
> > > While s390x doesn't need __tls_get_addr to be a public symbol, I'd
> > > kinda prefer not to have an arch-specific hack to make it hidden.
> > > Looking at the code, it's got to be significantly gratuitously slow
> > > having __tls_get_offset making a second function call to
> > > __tls_get_addr, setting up a stack frame and all.
> > >
> > > The __tls_get_offset code dates back to 2016 when it was actually
> > > necessary to call into C code in case new TLS needed to be installed.
> > > Since 2019 (9d44b6460a) that's not necessary, so I think we could just
> > > open code the asm for __tls_get_offset entirely and have it be
> > > decently fast.
> >
> > That sounds reasonable. I don't have a ton of experience with writing
> > s390x assembly, though. I can do the obvious thing and extract the
> > compiled logic from __tls_get_addr without the calling convention
> > fluff. Would that be sufficient?
>
> That's what I was looking at doing. Basically just compiling a
> modified version of __tls_get_addr that subtracts the thread pointer,
> then prepending the code to load the index address from the GOT
> pointer argument in r12.
>
> A further optimization later could be storing the address with tp
> pre-subtracted in the dtv. This would also be optimal for archs with
> TLSDESC support, at the expense of an extra addition in legacy
> __tls_get_addr access. On some archs it may even save a temp register
> in the TLSDESC function.
>
> Rich
I am not versed in s390x assembly, but I have some notes about __tls_get_offset:
https://maskray.me/blog/2024-02-11-toolchain-notes-on-z-architecture#general-dynamic-tls-model
The 32-bit ABI had to use __tls_get_offset because some nice instructions
from the general-instructions-extension facility were unavailable when the
ABI was codified.
It was just unfortunate that the 64-bit ABI followed the 32-bit
__tls_get_offset design.