From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: TLS issue on aarch64
Date: Sat, 26 May 2018 20:34:30 -0400
Message-ID: <20180527003430.GG1392@brightrain.aerifal.cx>
References: <20180525145059.GG4418@port70.net> <20180526005415.GI4418@port70.net>
In-Reply-To: <20180526005415.GI4418@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sat, May 26, 2018 at 02:54:16AM +0200, Szabolcs Nagy wrote:
> * Phillip Berndt [2018-05-26 00:20:04 +0200]:
> > 2018-05-25 16:50 GMT+02:00 Szabolcs Nagy:
> > > i think the constraints for tp are:
> > >
> > > - tp must be aligned to 'tls_align'
> > >
> > > - tp must be at a small fixed offset from the end
> > >   of pthread struct (so asm code can access the dtv)
> > >
> > > - tp + off must be usable memory for tls for off >= 16
> > >   (this is aarch64 specific)
> > >
> >
> > Hmm.. but these constraints do not explain the extra offset of one
> > alignment I'm seeing in the GCC output, do they? If I compile a
>
> tp must be aligned and tp + offset must be aligned too,
> but offset >= 16 has to hold.
>
> > program with a single TLS variable with
> > __attribute__((aligned(n)) that does nothing but try to reference and
> > print said variable, I get the
> > following assembler code from GCC:
> >
> > For n = 0x1000:
> >
> >   400194: d53bd041  mrs  x1, tpidr_el0
> >   400198: b0000040  adrp x0, 409000 <__subtf3+0xd18>
> >   40019c: 91400421  add  x1, x1, #0x1, lsl #12
> >   4001a0: 91000021  add  x1, x1, #0x0
> >
> >
> > For n = 0x100:
> >
> >   400194: d53bd041  mrs  x1, tpidr_el0
> >   400198: b0000040  adrp x0, 409000 <__subtf3+0xd18>
> >   40019c: 91400021  add  x1, x1, #0x0, lsl #12
> >   4001a0: 91040021  add  x1, x1, #0x100
> >
> > For n = 0x10:
> >
> >   400194: d53bd041  mrs  x1, tpidr_el0
> >   400198: b0000040  adrp x0, 409000 <__subtf3+0xd18>
> >   40019c: 91400021  add  x1, x1, #0x0, lsl #12
> >   4001a0: 91004021  add  x1, x1, #0x10
> >
> > That's how I came up with the mem += libc.tls_align hack in the first place.
> >
>
> indeed you need another alignment there, i came up with the
> following fix:
>
> (on mips/ppc i expect it not to change anything: tp is
> at a page aligned offset from the end of struct pthread,
> so one alignment is enough there, but on aarch64/arm/sh4
> this makes a difference, and seems to pass my simple tests)
>
> diff --git a/src/env/__init_tls.c b/src/env/__init_tls.c
> index 1c5d98a0..8e70024d 100644
> --- a/src/env/__init_tls.c
> +++ b/src/env/__init_tls.c
> @@ -41,9 +41,12 @@ void *__copy_tls(unsigned char *mem)
>  #ifdef TLS_ABOVE_TP
>  	dtv = (void **)(mem + libc.tls_size) - (libc.tls_cnt + 1);
>  
> -	mem += -((uintptr_t)mem + sizeof(struct pthread)) & (libc.tls_align-1);
> +	/* Ensure TP is aligned. */
> +	mem += -(uintptr_t)TP_ADJ(mem) & (libc.tls_align-1);
>  	td = (pthread_t)mem;
>  	mem += sizeof(struct pthread);
> +	/* Ensure TLS is aligned after struct pthread. */
> +	mem += -(uintptr_t)mem & (libc.tls_align-1);
>  
>  	for (i=1, p=libc.tls_head; p; i++, p=p->next) {
>  		dtv[i] = mem + p->offset;

As written this (or anything using libc.tls_align to adjust offset of
the TLS from the TP) is not valid. The value of libc.tls_align is
runtime-variable and will increase upon dlopen, and even without
dlopen, will be non-deterministic dependent on shared libraries from
DT_NEEDED in dynamic-linked programs.

The offset between TP and TLS is a property of the linker's handling
of local-exec TLS in the main program only, and thus probably should
be using libc.tls_head.align. However, care needs to be taken that
libc.tls_head may initially be null if the main program has no TLS,
but could later become non-null due to dlopen. If the offset between
TP and TLS changed due to this, any initial-exec-model TLS access
would be wrong.

Fortunately such a program cannot have initial-exec-model accesses
(initial-exec is only valid for TLS that existed at program start), so
we can probably just ignore the issue and always use
libc.tls_head?libc.tls_head.align:1; this will cause gratuitous
padding for threads created after dlopen of a library with larger
alignment, but should otherwise not hurt anything.

Rich
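
To make the suggestion above concrete, here is a minimal standalone sketch
of the arithmetic, not the actual musl code. All names in it
(tls_module_sketch, pad_to_align, main_tls_align, tls_start) are
hypothetical and exist only for illustration. It assumes that tls_head is
a pointer to a list of TLS modules whose first entry, when present,
belongs to the main program, which is why the sketch spells the field
access with "->".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-module TLS record; only the
 * fields needed for this illustration are modelled. */
struct tls_module_sketch {
	struct tls_module_sketch *next;
	size_t align;
};

/* Bytes of padding needed to round addr up to a multiple of align.
 * This is the "-x & (align-1)" idiom from the quoted patch; it is
 * only correct when align is a power of two. */
static size_t pad_to_align(uintptr_t addr, size_t align)
{
	assert(align && !(align & (align - 1)));
	return -addr & (align - 1);
}

/* Alignment that determines the TP-to-TLS offset: taken from the head
 * of the module list (assumed to be the main program) if there is one,
 * else 1, so that a later dlopen cannot change the result for a
 * program that started without TLS. */
static size_t main_tls_align(const struct tls_module_sketch *head)
{
	return head ? head->align : 1;
}

/* Address of the first TLS byte, given the address just past the
 * thread descriptor. Uses the main program's alignment only, never a
 * global maximum that grows as libraries are loaded. */
static uintptr_t tls_start(uintptr_t td_end,
                           const struct tls_module_sketch *head)
{
	return td_end + pad_to_align(td_end, main_tls_align(head));
}
```

With a descriptor ending at 0x1010 and a main-program TLS alignment of
0x100, tls_start yields 0x1100; with no main-program TLS (null head) the
alignment falls back to 1 and no padding is added.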