mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: TLS issue on aarch64
Date: Thu, 31 May 2018 20:52:00 -0400	[thread overview]
Message-ID: <20180601005200.GO1392@brightrain.aerifal.cx> (raw)
In-Reply-To: <20180601001102.GL4418@port70.net>

On Fri, Jun 01, 2018 at 02:11:02AM +0200, Szabolcs Nagy wrote:
> * Szabolcs Nagy <nsz@port70.net> [2018-05-29 08:33:17 +0200]:
> > * Rich Felker <dalias@libc.org> [2018-05-28 18:15:21 -0400]:
> > > On Mon, May 28, 2018 at 10:47:31PM +0200, Szabolcs Nagy wrote:
> > > > another issue with the patch is that if tp is aligned then pthread_t
> > > > may not get aligned:
> > > > 
> > > > tp == self + sizeof(pthread_t) - reserved
> > > 
> > > This expression does not look correct; it would have tp point
> > > somewhere inside of the pthread structure rather than just past the
> > > end of it.
> > > 
> > 
> > this is the current tp setup on
> > aarch64, arm and sh4, see TP_ADJ
> > 
> > we can just make tp = self+sizeof(pthread_t)
> > but then there will be a small unused hole
> > 
> > > Maybe your code is doing this to avoid wasted padding, but if so I
> > > think that's a bad idea. It violates the invariant that the pthread
> > > structure is at a constant offset from tp, which is needed for
> > > efficient pthread_self and for access to fixed slots at the end of it
> > > (like canary or dtv copy).
> > > 
> > > > so sizeof(pthread_t) - reserved must be divisible with
> > > > gcd(alignment of tp, alignof(pthread_t)) to be able to make both
> > > > self and tp aligned.
> > > > 
> > > > this is not an issue on current targets with current pthread_t,
> > > > but we may want to decouple internal struct pthread alignment
> > > > details and the abi reserved tls size, i.e. tp_adj could be like
> > > > 
> > > > tp == alignup(self + sizeof(pthread_t) - reserved, alignof(pthread_t))
> > > > 
> > > > or we add a static assert that reserved and alignof(pthread_t)
> > > > are not conflicting.
> > > 
> > > Maybe I'm misunderstanding what "reserved" is, since you're talking
> > > about a static assert...?
> > > 
> > 
> > it is the abi reserved space after tp
> > (the bfd linker calls it TCB_SIZE)
> 
> did some digging into the bfd linker code, i dump here my understanding:
> 
> tls variant 1 has two sub variants: ppc (mips is the same) and
> aarch64 (arm, sh are the same just s/16/8/ for the reserved space)

This is very helpful.

> base: tls segment base address in the elf obj
> align: tls segment alignment requirement in the elf obj
> addr: address of a tls var in the tls segment of the elf obj
> tp: thread pointer at runtime
> ptr: address of tls var at runtime
> 
> local-exec (off: offset from tp in the code):
> 	code: ptr = tp + off
> 
> 	ppc: off = addr - (base + 0x7000)
> 	aarch64: off = addr - (base - alignup(16, align))
> 
> initial-exec (got,add: REL_TPOFF got entry and value/addend):
> [...]
> local-dynamic (non-tlsdesc, got,add: REL_DTPOFF got entry and value, off: offset in the code):
> [...]
> general-dynamic (non-tlsdesc, got,add: REL_DTPOFF got entry and value):
> [...]

None but local-exec are really linker ABI, except kinda in the case of
static linking if the linker fails to perform relaxations. A more
precise way to say this would be that only TLS which is accessible via
local-exec model has any constraints imposed on it by the linker as to
where it lies relative to TP. For initial-exec, the address relative
to TP must be constant across all threads, but it's not set by the
linker (except in the case of failed relaxation, in which case it's
just the local-exec model with the offset stored out-of-band). For
other models, there are no constraints at all; it's entirely up to the
dynamic linker implementation how it arranges for things to work.

> ptr has correct alignment if the tls segment is mapped to an aligned address,
> i.e. ptr - (addr - base) must be aligned, working backwards from this
> the requirement is
> 
> for local-exec to work:
> 	ppc: tp - 0x7000 must be aligned

This should be fine, since musl arranges for self+sizeof(struct
pthread) to be aligned, and that's exactly what tp-0x7000 is.

> 	aarch64: tp + alignup(16, align) must be aligned == tp must be aligned

OK, I see two possible solutions here:

1. tp==self+sizeof(struct pthread). In this case we'll waste some
space (vs the current approach) when no extra alignment is needed, but
it's simple and clean because all the alignments match up naturally.

2. tp==self+sizeof(struct pthread)-16 (or rather -reserved in
general). This preserves the current memory usage, but requires
complex new alignment logic since self will no longer be aligned mod
tls_align when tls_align>reserved.

I pretty strongly prefer option 1.

In either case, the main_tls.offset/app.tls.offset value needs to
correctly reflect the offset of the TLS from TP, so it either needs to
be alignup(reserved,tls_align) or alignup(reserved,tls_align)-reserved
depending on option 1 or 2. After that change is made, we need to make
sure the storage needs (libc.tls_size) are computed correctly and
account for the extra space due to the initial positive offset.

No change is then needed in __copy_tls.

Changes to TP_ADJ and __pthread_self are needed to get reserved out of
them, and the value of reserved needs to be provided somewhere else
for computing main_tls.offset.

> for initial-exec to work:
> 	tp + *got - add must be aligned (i.e. *got has to be set up to meet
> 	the alignment requirement of the module, this does not seem to require
> 	realignment of tp so even runtime loading of initial-exec tls should
> 	be possible assuming there is enough space etc...)

There's never space so it's not even a question, but even if there
were, no, it can't be done because tp will not be aligned mod some
possibly-larger alignment than the alignment in effect at the time the
thread was created.

> i can come up with a new patch, but it's not clear if on aarch64 we want
> to allow tp to point inside pthread_t or strictly point at the end (then
> the 16 reserved bytes are unused), note that in glibc the 16 byte after tp
> contains dtv and an unused ptr and it is expected that additional magic tls
> data the compiler needs to access (e.g. stackend for split stack support or
> stack canary) comes before tls (so the layout is pthread_t, some magic tls
> data, tp, 16 byte with dtv, tls), for us only the dtv is important to access
> with fixed offset from tp (even that's not necessary if we generate the load
> offset in the tlsdesc asm based on sizeof(pthread_t) etc)
> 
> i haven't looked into how gdb tries to handle tls (without threaddb)
> there might be further abi requirements for gdb to work well.

I don't think gdb is working reasonably at all right now for accessing
TLS on musl. It really should just be enhanced to inject calls to
__tls_get_addr and/or dlsym.

Rich


  reply	other threads:[~2018-06-01  0:52 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-25 12:40 Phillip Berndt
2018-05-25 14:50 ` Szabolcs Nagy
2018-05-25 21:38   ` Szabolcs Nagy
2018-05-25 22:20   ` Phillip Berndt
2018-05-26  0:54     ` Szabolcs Nagy
2018-05-26  8:24       ` Phillip Berndt
2018-05-27  0:34       ` Rich Felker
2018-05-28 20:47         ` Szabolcs Nagy
2018-05-28 22:15           ` Rich Felker
2018-05-29  6:33             ` Szabolcs Nagy
2018-05-31 15:22               ` Phillip Berndt
2018-06-01  0:11               ` Szabolcs Nagy
2018-06-01  0:52                 ` Rich Felker [this message]
2018-06-01  9:38                   ` Szabolcs Nagy
2018-05-27  0:17   ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180601005200.GO1392@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).