From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/7751 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: ppc soft-float regression Date: Sun, 24 May 2015 20:36:48 -0400 Message-ID: <20150525003648.GO17573@brightrain.aerifal.cx> References: <20150517163723.GP17573@brightrain.aerifal.cx> <20150517175021.GA2171@euler> <20150517181556.GA23050@euler> <20150517195622.GA4761@euler> <20150518183929.GA6370@euler> <20150518201043.GX17573@brightrain.aerifal.cx> <20150518201422.GY17573@brightrain.aerifal.cx> <20150518220731.GA31132@euler> <20150522062346.GK17573@brightrain.aerifal.cx> <20150524030809.GA19134@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1432514228 6660 80.91.229.3 (25 May 2015 00:37:08 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 25 May 2015 00:37:08 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-7763-gllmg-musl=m.gmane.org@lists.openwall.com Mon May 25 02:37:05 2015 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1YwgO7-0002Np-Lc for gllmg-musl@m.gmane.org; Mon, 25 May 2015 02:37:03 +0200 Original-Received: (qmail 16164 invoked by uid 550); 25 May 2015 00:37:02 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 16130 invoked from network); 25 May 2015 00:37:01 -0000 Content-Disposition: inline In-Reply-To: <20150524030809.GA19134@brightrain.aerifal.cx> User-Agent: Mutt/1.5.21 (2010-09-15) Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:7751 Archived-At: On Sat, May 23, 2015 at 11:08:09PM -0400, Rich Felker wrote: > > Attached is a patch that finishes the job by completing option 3. I > > haven't tested it much yet so I'll hold off on committing it for a > > while but it seems to work fine (not break anything) on i386. > > > > diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c > > index 93595a0..485bd4f 100644 > > --- a/src/ldso/dynlink.c > > +++ b/src/ldso/dynlink.c > > @@ -280,12 +280,17 @@ static void do_relocs(struct dso *dso, size_t *rel, size_t rel_size, size_t stri > > def.dso = dso; > > } > > > > - int gotplt = (type == REL_GOT || type == REL_PLT); > > - if (dso->rel_update_got && !gotplt && stride==2) continue; > > - > > - addend = stride>2 ? rel[2] > > - : gotplt || type==REL_COPY ? 0 > > - : *reloc_addr; > > + if (stride > 2) { > > + addend = rel[2]; > > + } else if (type==REL_GOT || type==REL_PLT || type==REL_COPY) { > > + addend = 0; > > + } else { > > + addend = *reloc_addr; > > + if (dso->rel_update_got) { > > + struct symdef old = find_sym(&ldso, name, 0); > > + addend -= (size_t)ldso.base+old.sym->st_value; > > + } > > + } > > Actually I'm not happy with this patch as-is. It's only valid for > REL_SYMBOLIC (or REL_SYM_OR_REL with a symbol) type relocations, > because it's assuming that the value at reloc_addr is sym_val+addend. > We could restrict reprocessing to just those types, but there are a > number of other reloc types that could theoretically arise and that we > should be treating correctly. REL_OFFSET/REL_OFFSET32 probably should > not appear in libc.so (or anything without TEXTRELs), but if we need > to support them, we would also need to adjust by (size_t)reloc_addr. > What's more important, though, are TLS-type relocations which in > principle could appear if libgcc.a is emulating floating point > environment for softfloat via TLS. REL_DTPOFF and REL_TLSDESC are > probably the only ones that would be valid here (only GD model is > valid in shared libraries) and REL_DTPOFF is trivial to reverse and > extract an addend, but REL_TLSDESC is relatively complex to handle. > > Sure we could just do REL_SYMBOLIC for now, but if we can't yet solve > the problem in a future-proof way, I'm not sure there's much value in > committing the patch at this point, since there's no present issue > it's fixing. I think I have a potential solution. What I've wanted to do is backup the original addends before stage 2 relocation, either by constructing a RELA table to replace the REL or just backing up the addends out-of-line and having special logic in do_relocs to pull them instead of the inline addends (the latter takes 1/3 the space, which is appealing). Unfortunately, this requires allocation of storage, which pessimizes archs where this whole issue doesn't matter -- extra syscalls (mmap) and/or large enough static storage to be safely above the possible size of the addends. There's a simple alternative I just came up with though: have dlstart.c compute the number of REL entries that need their addends saved and allocate a VLA on its stack for stages 2 and 3 to use. While the number of addends could be significant, it's many orders of magnitude smaller than the smallest practical stack sizes we could actually run with, so it's perfectly safe to put it on the stack. Here're the basic changes I'd like to make to dlstart.c to implement this: 1. Remove processing of DT_JMPREL REL/RELA table. All entries in this table are necessarily JMP_SLOT type, and thus symbolic, so there's nothing stage 1 can do with them anyway. Also, being JMP_SLOT type, they have implicit addends of zero if they're REL-type, so there's no need to save addends. 2. Remove the loop in dlstart.c that works like a fake function call 3 times to process DT_JMPREL, DT_REL, and DT_RELA. Instead we just need 2 iterations, and now the stride is constant in each, so they should simplify down a lot more inline. 3. During the loop that processes DT_REL, count the number of non-relative relocations (ones we skip at this stage), then make a VLA this size and pass its address to __dls2 as a second argument. 4. Have the do_relocs in stage 2 save addends in this provided array before overwriting them, and save its address for use by stage 3. 5. Have the do_relocs in stage 3 (for ldso/libc only) pull addends from this array instead of of from inline. Steps 1 and 2 are purely code removal/simplification and should be done regardless of whether we move forward on the above program, I think. Steps 3-5 add some complexity but hardly any code, just a few lines here and there. Comments? Rich