mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH] fix tls access on arm targets before armv6k
@ 2018-08-23 14:10 Szabolcs Nagy
  2018-08-24 19:27 ` [PATCH v2] " Szabolcs Nagy
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2018-08-23 14:10 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 216 bytes --]

fixing the 'large tls alignment on tls above tp targets' issue
introduced a regression on arm which i didnt notice because i
only tested it with armv7-a code gen.

fortunately there was no release since that commit.

[-- Attachment #2: 0001-fix-tls-access-on-arm-targets-before-armv6k.patch --]
[-- Type: text/x-diff, Size: 1037 bytes --]

From 539dc72e6d75779082c862b2bb4463c2af4b7953 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <nsz@port70.net>
Date: Thu, 23 Aug 2018 10:57:34 +0000
Subject: [PATCH] fix tls access on arm targets before armv6k

commit 610c5a8524c3d6cd3ac5a5f1231422e7648a3791 changed the thread
pointer setup so tp points at the end of the pthread struct on arm,
but failed to update __aeabi_read_tp so it was off by 8.

this broke tls access in code that is compiled with -mtp=soft, which
is the default when target arch is pre armv6k or thumb1.
---
 src/thread/arm/__aeabi_read_tp_c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
index 654bdc57..0c56d613 100644
--- a/src/thread/arm/__aeabi_read_tp_c.c
+++ b/src/thread/arm/__aeabi_read_tp_c.c
@@ -4,5 +4,5 @@
 __attribute__((__visibility__("hidden")))
 void *__aeabi_read_tp_c(void)
 {
-	return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
+	return TP_ADJ(__pthread_self());
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] fix tls access on arm targets before armv6k
  2018-08-23 14:10 [PATCH] fix tls access on arm targets before armv6k Szabolcs Nagy
@ 2018-08-24 19:27 ` Szabolcs Nagy
  2018-08-24 20:17   ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2018-08-24 19:27 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 46 bytes --]

rewrote it in asm to fix a runtime-abi issue.

[-- Attachment #2: 0001-fix-tls-access-on-arm-targets-before-armv6k.patch --]
[-- Type: text/x-diff, Size: 2012 bytes --]

From 90da0eb3d0e92203d822ef9d5f96ef0e4e276ca2 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <nsz@port70.net>
Date: Thu, 23 Aug 2018 10:57:34 +0000
Subject: [PATCH] fix tls access on arm targets before armv6k

commit 610c5a8524c3d6cd3ac5a5f1231422e7648a3791 changed the thread
pointer setup so tp points at the end of the pthread struct on arm,
but failed to update __aeabi_read_tp so it was off by 8.

this broke tls access in code that is compiled with -mtp=soft, which
is the default when target arch is pre armv6k or thumb1.

__aeabi_read_tp used to call c code, but that was incorrect as the
arm runtime abi specifies special pcs for this function: it is only
allowed to clobber r0, ip, lr and cpsr.

if musl is compiled for a target with hardware tp then the code can
be further optimized so __aeabi_read_tp is an alias to __a_gettp_cp15,
but that requires preprocessed asm with conditionals, for now this
is a minimal bug fix.
---
 src/thread/arm/__aeabi_read_tp.s   | 10 ++++++----
 src/thread/arm/__aeabi_read_tp_c.c |  8 --------
 2 files changed, 6 insertions(+), 12 deletions(-)
 delete mode 100644 src/thread/arm/__aeabi_read_tp_c.c

diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
index 9d0cd311..2585620c 100644
--- a/src/thread/arm/__aeabi_read_tp.s
+++ b/src/thread/arm/__aeabi_read_tp.s
@@ -2,7 +2,9 @@
 .global __aeabi_read_tp
 .type __aeabi_read_tp,%function
 __aeabi_read_tp:
-	push {r1,r2,r3,lr}
-	bl __aeabi_read_tp_c
-	pop {r1,r2,r3,lr}
-	bx lr
+	ldr r0,1f
+	add r0,r0,pc
+	ldr r0,[r0]
+2:	bx r0
+	.align 2
+1:	.word __a_gettp_ptr - 2b
diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
deleted file mode 100644
index 654bdc57..00000000
--- a/src/thread/arm/__aeabi_read_tp_c.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include "pthread_impl.h"
-#include <stdint.h>
-
-__attribute__((__visibility__("hidden")))
-void *__aeabi_read_tp_c(void)
-{
-	return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
-}
-- 
2.18.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] fix tls access on arm targets before armv6k
  2018-08-24 19:27 ` [PATCH v2] " Szabolcs Nagy
@ 2018-08-24 20:17   ` Rich Felker
  2018-08-24 21:56     ` Szabolcs Nagy
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2018-08-24 20:17 UTC (permalink / raw)
  To: musl

On Fri, Aug 24, 2018 at 09:27:50PM +0200, Szabolcs Nagy wrote:
> rewrote it in asm to fix a runtime-abi issue.

I've already queued the previous fix. I could replace it since it's
not pushed, but I'd kinda prefer to make this change independent from
the bugfix since it's already been done in 2 steps anyway; normally I
don't like to mix bugfixes with refactoring except when it would be
hard to do the bugfix independently first.

> >From 90da0eb3d0e92203d822ef9d5f96ef0e4e276ca2 Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <nsz@port70.net>
> Date: Thu, 23 Aug 2018 10:57:34 +0000
> Subject: [PATCH] fix tls access on arm targets before armv6k
> 
> commit 610c5a8524c3d6cd3ac5a5f1231422e7648a3791 changed the thread
> pointer setup so tp points at the end of the pthread struct on arm,
> but failed to update __aeabi_read_tp so it was off by 8.
> 
> this broke tls access in code that is compiled with -mtp=soft, which
> is the default when target arch is pre armv6k or thumb1.
> 
> __aeabi_read_tp used to call c code, but that was incorrect as the
> arm runtime abi specifies special pcs for this function: it is only
> allowed to clobber r0, ip, lr and cpsr.

Maybe keep just this paragraph if factoring as 2 commits?

> if musl is compiled for a target with hardware tp then the code can
> be further optimized so __aeabi_read_tp is an alias to __a_gettp_cp15,
> but that requires preprocessed asm with conditionals, for now this
> is a minimal bug fix.

I think this optimization isn't so important since, if the user is
compiling for a higher ISA level, they won't even be generating calls
to __aeabi_read_tp. It would marginally help the case where you
compile an optimized-for-native-cpu libc.so but use low-baseline
application binaries, but I doubt that's a case many users care about.

> ---
>  src/thread/arm/__aeabi_read_tp.s   | 10 ++++++----
>  src/thread/arm/__aeabi_read_tp_c.c |  8 --------
>  2 files changed, 6 insertions(+), 12 deletions(-)
>  delete mode 100644 src/thread/arm/__aeabi_read_tp_c.c
> 
> diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
> index 9d0cd311..2585620c 100644
> --- a/src/thread/arm/__aeabi_read_tp.s
> +++ b/src/thread/arm/__aeabi_read_tp.s
> @@ -2,7 +2,9 @@
>  .global __aeabi_read_tp
>  .type __aeabi_read_tp,%function
>  __aeabi_read_tp:
> -	push {r1,r2,r3,lr}
> -	bl __aeabi_read_tp_c
> -	pop {r1,r2,r3,lr}
> -	bx lr
> +	ldr r0,1f
> +	add r0,r0,pc
> +	ldr r0,[r0]
> +2:	bx r0
> +	.align 2
> +1:	.word __a_gettp_ptr - 2b

Isn't there an idiomatic "adr" pseudo-mnemonic preferable to use
instead of the explicit pc arithmetic here? Or is that necessarily
less efficient?

> diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
> deleted file mode 100644
> index 654bdc57..00000000
> --- a/src/thread/arm/__aeabi_read_tp_c.c
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -#include "pthread_impl.h"
> -#include <stdint.h>
> -
> -__attribute__((__visibility__("hidden")))
> -void *__aeabi_read_tp_c(void)
> -{
> -	return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
> -}
> -- 
> 2.18.0
> 



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] fix tls access on arm targets before armv6k
  2018-08-24 20:17   ` Rich Felker
@ 2018-08-24 21:56     ` Szabolcs Nagy
  2018-08-25  0:06       ` [PATCH] rewrite __aeabi_read_tp in asm Szabolcs Nagy
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2018-08-24 21:56 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@libc.org> [2018-08-24 16:17:36 -0400]:
> On Fri, Aug 24, 2018 at 09:27:50PM +0200, Szabolcs Nagy wrote:
> > rewrote it in asm to fix a runtime-abi issue.
> 
> I've already queued the previous fix. I could replace it since it's
> not pushed, but I'd kinda prefer to make this change independent from
> the bugfix since it's already been done in 2 steps anyway; normally I
> don't like to mix bugfixes with refactoring except when it would be
> hard to do the bugfix independently first.
> 

ok.

> >  __aeabi_read_tp:
> > -	push {r1,r2,r3,lr}
> > -	bl __aeabi_read_tp_c
> > -	pop {r1,r2,r3,lr}
> > -	bx lr
> > +	ldr r0,1f
> > +	add r0,r0,pc
> > +	ldr r0,[r0]
> > +2:	bx r0
> > +	.align 2
> > +1:	.word __a_gettp_ptr - 2b
> 
> Isn't there an idiomatic "adr" pseudo-mnemonic preferable to use
> instead of the explicit pc arithmetic here? Or is that necessarily
> less efficient?
> 

adr r,label sets r to the address of label within a small offset. i don't
think that helps, you can avoid using pc, but it will be more instructions:

	ldr r0,1f
	adr ip,1f // == add ip,pc,1f-2f
	add r0,ip
2:	ldr r0,[r0]
	bx r0
	.align 2
1:	.word __a_gettp_ptr - 1b

the main ugliness of the pc arithmetics is that pc evaluates to pc+8 in arm
mode but pc+4 in thumb, that's why i needed label 2, but i think it's fine.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] rewrite __aeabi_read_tp in asm
  2018-08-24 21:56     ` Szabolcs Nagy
@ 2018-08-25  0:06       ` Szabolcs Nagy
  0 siblings, 0 replies; 5+ messages in thread
From: Szabolcs Nagy @ 2018-08-25  0:06 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 622 bytes --]

* Szabolcs Nagy <nsz@port70.net> [2018-08-24 23:56:21 +0200]:
> * Rich Felker <dalias@libc.org> [2018-08-24 16:17:36 -0400]:
> > On Fri, Aug 24, 2018 at 09:27:50PM +0200, Szabolcs Nagy wrote:
> > > rewrote it in asm to fix a runtime-abi issue.
> > 
> > I've already queued the previous fix. I could replace it since it's
> > not pushed, but I'd kinda prefer to make this change independent from
> > the bugfix since it's already been done in 2 steps anyway; normally I
> > don't like to mix bugfixes with refactoring except when it would be
> > hard to do the bugfix independently first.

attached it as a separate patch.

[-- Attachment #2: 0001-rewrite-__aeabi_read_tp-in-asm.patch --]
[-- Type: text/x-diff, Size: 1412 bytes --]

From 9fcc75f35b9d7bb5af60c6d0f5ff32f11ca00802 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <nsz@port70.net>
Date: Fri, 24 Aug 2018 23:11:59 +0000
Subject: [PATCH] rewrite __aeabi_read_tp in asm

__aeabi_read_tp used to call c code, but that was incorrect as the
arm runtime abi specifies special pcs for this function: it is only
allowed to clobber r0, ip, lr and cpsr.
---
 src/thread/arm/__aeabi_read_tp.s   | 10 ++++++----
 src/thread/arm/__aeabi_read_tp_c.c |  8 --------
 2 files changed, 6 insertions(+), 12 deletions(-)
 delete mode 100644 src/thread/arm/__aeabi_read_tp_c.c

diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
index 9d0cd311..2585620c 100644
--- a/src/thread/arm/__aeabi_read_tp.s
+++ b/src/thread/arm/__aeabi_read_tp.s
@@ -2,7 +2,9 @@
 .global __aeabi_read_tp
 .type __aeabi_read_tp,%function
 __aeabi_read_tp:
-	push {r1,r2,r3,lr}
-	bl __aeabi_read_tp_c
-	pop {r1,r2,r3,lr}
-	bx lr
+	ldr r0,1f
+	add r0,r0,pc
+	ldr r0,[r0]
+2:	bx r0
+	.align 2
+1:	.word __a_gettp_ptr - 2b
diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
deleted file mode 100644
index 0c56d613..00000000
--- a/src/thread/arm/__aeabi_read_tp_c.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include "pthread_impl.h"
-#include <stdint.h>
-
-__attribute__((__visibility__("hidden")))
-void *__aeabi_read_tp_c(void)
-{
-	return TP_ADJ(__pthread_self());
-}
-- 
2.18.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-08-25  0:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-23 14:10 [PATCH] fix tls access on arm targets before armv6k Szabolcs Nagy
2018-08-24 19:27 ` [PATCH v2] " Szabolcs Nagy
2018-08-24 20:17   ` Rich Felker
2018-08-24 21:56     ` Szabolcs Nagy
2018-08-25  0:06       ` [PATCH] rewrite __aeabi_read_tp in asm Szabolcs Nagy

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).