From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: TLS storage offsets for TLS_ABOVE_TP
Date: Fri, 9 Feb 2018 15:03:22 -0500
Message-ID: <20180209200322.GJ1627@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Fri, Feb 09, 2018 at 06:07:25PM +0000, Nicholas Wilson wrote:
> Hi,
>
> I have a question about the support for TLS_ABOVE_TP ("TLS variant
> I").
>
> In archs like ARM, we have a matched pair of functions TP_ADJ and
> __pthread_self, which adjust a pthread* to the thread-register and
> back again.
> For ARM, there's an additional offset of 8 (and 16 for
> AArch64), which is part of the ABI to ensure a) the TP points to the
> DTV, and b) the main module's TLS block is at a known offset from
> the TP.
>
> However, the ARM adjustment code uses "TP = pthread* +
> sizeof(pthread) - 8". That's correct for the arch ABI, in that the
> linker requires that the thread storage be located 8 bytes above the
> TP, and Musl does indeed store the TLS block there right after the
> struct pthread.
>
> But what's odd is that you have the pthread->dtv_copy member right
> at the end of the pthread struct. So the thread pointer is pointing
> not to pthread->dtv_copy, but actually to pthread->canary_at_end.

dtv_copy at the end is just for internal use by ASM that needs to be
able to find the dtv pointer. I don't think there's any code on ARM
that uses it, but some archs might.

> Is my mental compiler going wrong? I don't have an ARM machine to
> actually execute the code on, but I've just been staring at it for
> an hour and can't work it out.
>
> One thing I have noticed is that in all the four TLS models (general
> dynamic, local dynamic, initial exec, local exec), the DTV isn't
> actually dereferenced directly by compiler-generated code. The whole
> thing about having the DTV available to the compiler in a known
> location in fact is not used! So maybe that's how you get away with.

This is exactly right. There is no valid way the compiler can generate
code to access the DTV because it doesn't know the generation counter
to compare against (on glibc). Drepper's TLS ABI document muddled a
lot of things like this which are actually implementation details
because he was documenting his implementation rather than the
compiler-linker, linker-ldso, etc. ABI boundaries. This was an
important observation at the time I implemented TLS in musl.
Since access to the DTV is not actually ABI, its form is not fixed but
an implementation detail, and we used a form that omits the glibc
generation counters since we don't unload modules.

> For the sake of "correctness" and conformance though, I wonder if
> there should be a final "void *dtv_pad" member at the end of struct
> pthread, so that the DTV block at the end of the struct pthread has
> the right size for the platform.

No, this would shift canary_at_end back, breaking the ABI for it --
and the canary-at-end *is* ABI on the archs that use it. (If in the
future there are multiple incompatible places the canary could be
across different archs, we'll have to adjust this section with
preprocessor conditionals.) Alternatively we might be able to adjust
the shift of struct __pthread relative to the TP per-arch.

> I'd be happy to hear I'm wrong! (Maybe a diagram in the source code
> would help comprehension, to show how the memory is laid out,
> including the alignment bits and pieces. The FreeBSD libc source
> code has a helpful bit of ASCII art that draws the layout of the
> various bits of data and the alignment blocks between them.)

It's not so much that you're wrong as that the TLS document is
misleading.

Rich
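
For reference, the offset arithmetic discussed above can be sketched in
C. This is only an illustration under stated assumptions, not musl's
actual definitions: struct fake_pthread, TP_GAP and the fake_* helpers
are hypothetical names, the six placeholder members are invented, and
the 4-byte fields stand in for a 32-bit ARM layout so the numbers come
out the same on any host. Only the member names canary_at_end and
dtv_copy and the formula "TP = pthread* + sizeof(pthread) - 8" are
taken from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit ARM struct pthread. Only the last
 * two member names come from the thread; the rest is invented.
 * Fixed 4-byte fields model ILP32 regardless of the host. */
struct fake_pthread {
    uint32_t earlier_members[6]; /* placeholder for the rest of the struct */
    uint32_t canary_at_end;
    uint32_t dtv_copy;           /* models a 32-bit pointer */
};

enum { TP_GAP = 8 }; /* ARM: the TLS block begins 8 bytes above the TP */

/* Models "TP = pthread* + sizeof(pthread) - 8". */
uintptr_t fake_tp_adj(uintptr_t self)
{
    return self + sizeof(struct fake_pthread) - TP_GAP;
}

/* Inverse mapping: recover the struct address from the thread pointer. */
uintptr_t fake_pthread_self(uintptr_t tp)
{
    return tp - sizeof(struct fake_pthread) + TP_GAP;
}

/* Offset inside the struct that the thread pointer lands on. */
size_t fake_tp_offset(void)
{
    return sizeof(struct fake_pthread) - TP_GAP;
}
```

Under these assumptions fake_tp_offset() equals
offsetof(struct fake_pthread, canary_at_end), so the TP lands on
canary_at_end rather than dtv_copy, and TP + 8 is the first byte past
the struct, which is where the main module's TLS block would begin.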