From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Dynamic linker changes
Date: Sun, 5 Apr 2015 18:30:31 -0400
Message-ID: <20150405223031.GA29575@brightrain.aerifal.cx>
To: musl@lists.openwall.com

As part of the dynamic linker overhaul project for ssp-enabled libc.so,
I'd like to make a few somewhat unrelated changes to the dynamic linker.
Some aspects of these are just general improvements, but most of them
eliminate implementation snags I'm foreseeing in the early-relocation
code. Anyway, here they are:

Revisiting how we find load base address:

If the dynamic linker is invoked as the PT_INTERP for a program, it gets
its own base address as AT_BASE in auxv.
But if it's invoked directly, AT_BASE is empty, and we presently round
down the AT_PHDR address to a page boundary and assume that's the load
base. This is an ugly hack and not guaranteed to be correct (although it
should be with any reasonable linker).

A better approach is to have the asm entry point for the dynamic linker
compute the address of _DYNAMIC using its known PC-relative offset and
pass this into the C code. The C code can then find the base-relative
location of _DYNAMIC via PT_DYNAMIC in the program headers, and the
difference between these two values (absolute address of _DYNAMIC minus
base-relative address of _DYNAMIC) is the base.

Revisiting how ld.so skips argv entries:

Presently, when invoked as a command, ld.so uses an ugly hack for
stripping the beginning of argv[] before passing it to the main program
entry point. It replaces slots with (char*)-1, and the calling asm is
responsible for skipping over these before passing execution to the main
program's entry point. This requires a lot of ugly arch-specific asm,
and often this asm does not get tested early on, since invoking ld.so as
a command is not a commonly used feature.

A better approach would be to make the C part of the dynamic linker
never return, and instead call longjmp to pass execution to the main
program's entry point. Provided we tell the dynamic linker where the PC
and SP registers are located in jmp_buf, all it needs to do is store the
AT_ENTRY address into the PC slot and the updated start address of the
argv array into the SP slot, then call longjmp.

Stripping down entry point asm further:

Like the way crt_arch.h and crt1.c work for the main program entry point
now, almost all asm can be eliminated from the dynamic linker entry
point. All that's needed is some minimal asm to align SP, put the
original SP value (and now, also the address of _DYNAMIC) in argument
registers/slots, and tail-call to the C code.
The C code can be responsible for extracting argc out of the ELF argv array.