From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: TLS issue on aarch64
Date: Sat, 26 May 2018 20:34:30 -0400
Message-ID: <20180527003430.GG1392@brightrain.aerifal.cx>
In-Reply-To: <20180526005415.GI4418@port70.net>

On Sat, May 26, 2018 at 02:54:16AM +0200, Szabolcs Nagy wrote:
> * Phillip Berndt <phillip.berndt@googlemail.com> [2018-05-26 00:20:04 +0200]:
> > 2018-05-25 16:50 GMT+02:00 Szabolcs Nagy <nsz@port70.net>:
> > > i think the constraints for tp are:
> > >
> > > - tp must be aligned to 'tls_align'
> > >
> > > - tp must be at a small fixed offset from the end
> > > of pthread struct (so asm code can access the dtv)
> > >
> > > - tp + off must be usable memory for tls for off >= 16
> > > (this is aarch64 specific)
> > >
> >
> > Hmm.. but these constraints do not explain the extra offset of one
> > alignment I'm seeing in the GCC output, do they? If I compile a
>
> tp must be aligned and tp + offset must be aligned too,
> but offset >= 16 has to hold.
>
> > program with a single TLS variable with
> > __attribute__((aligned(n)) that does nothing but try to reference and
> > print said variable, I get the
> > following assembler code from GCC:
> >
> > For n = 0x1000:
> >
> > 400194: d53bd041 mrs x1, tpidr_el0
> > 400198: b0000040 adrp x0, 409000 <__subtf3+0xd18>
> > 40019c: 91400421 add x1, x1, #0x1, lsl #12
> > 4001a0: 91000021 add x1, x1, #0x0
> >
> >
> > For n = 0x100:
> >
> > 400194: d53bd041 mrs x1, tpidr_el0
> > 400198: b0000040 adrp x0, 409000 <__subtf3+0xd18>
> > 40019c: 91400021 add x1, x1, #0x0, lsl #12
> > 4001a0: 91040021 add x1, x1, #0x100
> >
> > For n = 0x10:
> >
> > 400194: d53bd041 mrs x1, tpidr_el0
> > 400198: b0000040 adrp x0, 409000 <__subtf3+0xd18>
> > 40019c: 91400021 add x1, x1, #0x0, lsl #12
> > 4001a0: 91004021 add x1, x1, #0x10
> >
> > That's how I came up with the mem += libc.tls_align hack in the first place.
> >
>
> indeed you need another alignment there, i came up with the
> following fix:
>
> (on mips/ppc i expect it not to change anything: tp is
> at a page aligned offset from the end of struct pthread,
> so one alignment is enough there, but on aarch64/arm/sh4
> this makes a difference, and seems to pass my simple tests)
>
> diff --git a/src/env/__init_tls.c b/src/env/__init_tls.c
> index 1c5d98a0..8e70024d 100644
> --- a/src/env/__init_tls.c
> +++ b/src/env/__init_tls.c
> @@ -41,9 +41,12 @@ void *__copy_tls(unsigned char *mem)
> #ifdef TLS_ABOVE_TP
> dtv = (void **)(mem + libc.tls_size) - (libc.tls_cnt + 1);
>
> - mem += -((uintptr_t)mem + sizeof(struct pthread)) & (libc.tls_align-1);
> + /* Ensure TP is aligned. */
> + mem += -(uintptr_t)TP_ADJ(mem) & (libc.tls_align-1);
> td = (pthread_t)mem;
> mem += sizeof(struct pthread);
> + /* Ensure TLS is aligned after struct pthread. */
> + mem += -(uintptr_t)mem & (libc.tls_align-1);
>
> for (i=1, p=libc.tls_head; p; i++, p=p->next) {
> dtv[i] = mem + p->offset;

As written, this (or anything else that uses libc.tls_align to adjust
the offset of the TLS from the TP) is not valid. The value of
libc.tls_align is runtime-variable: it will increase upon dlopen, and
even without dlopen it depends on which shared libraries are pulled in
via DT_NEEDED in dynamic-linked programs. The offset between TP and TLS
is a property of the linker's handling of local-exec TLS in the main
program only, and thus probably should be using libc.tls_head->align.

However, care needs to be taken: libc.tls_head may initially be null if
the main program has no TLS, and could later become non-null due to
dlopen. If the offset between TP and TLS changed because of this, any
initial-exec-model TLS access would be wrong. Fortunately such a
program cannot have initial-exec-model accesses (initial-exec is only
valid for TLS that existed at program start), so we can probably just
ignore the issue and always use
libc.tls_head ? libc.tls_head->align : 1. This will cause gratuitous
padding for threads created after dlopen of a library with larger
alignment, but should otherwise not hurt anything.
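
To make the arithmetic concrete, here is a minimal standalone sketch,
not the actual musl code. The struct and all function names are
hypothetical stand-ins; the 16-byte reserved area is the aarch64 value
mentioned earlier in the thread, and rounding it up to the main
program's alignment is an assumption that is consistent with the
offsets in the quoted GCC output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for musl's struct tls_module; only the one
 * field this sketch needs. */
struct tls_module_sketch {
    size_t align; /* alignment of the module's TLS image, a power of two */
};

/* Alignment that determines the TP-to-TLS distance: the main program's
 * TLS alignment, or 1 if the main program has no TLS (head is null). */
size_t main_tls_align(const struct tls_module_sketch *head)
{
    return head ? head->align : 1;
}

/* Number of bytes to add to addr to reach the next multiple of align.
 * Same idiom as the quoted diff: -x & (align-1). */
size_t pad_to_align(uintptr_t addr, size_t align)
{
    assert(align && !(align & (align - 1))); /* must be a power of two */
    return -addr & (align - 1);
}

/* Distance from TP to the first TLS byte, assuming a reserved area of
 * `reserved` bytes directly above TP (16 on aarch64) rounded up to the
 * main program's TLS alignment. */
size_t tls_offset_from_tp(size_t reserved, size_t align)
{
    return reserved + pad_to_align(reserved, align);
}
```

With reserved = 16 this yields 0x1000, 0x100 and 0x10 for alignments
0x1000, 0x100 and 0x10, matching the immediates in the disassembly
above, and it yields 16 when the main program has no TLS at all.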
Rich