From: Rich Felker
Date: Mon, 25 May 2015 02:57:56 -0400
To: musl@lists.openwall.com
Subject: Re: ppc soft-float regression
Message-ID: <20150525065756.GR17573@brightrain.aerifal.cx>
In-Reply-To: <1432535489.2715.1.camel@inria.fr>

On Mon, May 25, 2015 at 08:31:29AM +0200, Jens Gustedt wrote:
> Am Sonntag, den 24.05.2015, 20:36 -0400 schrieb Rich Felker:
> > There's a simple alternative I just came up with though: have
> > dlstart.c compute the number of REL entries that need their addends
> > saved and allocate a VLA on its stack for stages 2 and 3 to use.
> > While the number of addends could be significant, it's many orders
> > of magnitude smaller than the smallest practical stack sizes we
> > could actually run with, so it's perfectly safe to put it on the
> > stack.
> > 
> > Here are the basic changes I'd like to make to dlstart.c to
> > implement this:
> > 
> > 1. Remove processing of the DT_JMPREL REL/RELA table. All entries
> >    in this table are necessarily JMP_SLOT type, and thus symbolic,
> >    so there's nothing stage 1 can do with them anyway. Also, being
> >    JMP_SLOT type, they have implicit addends of zero if they're
> >    REL-type, so there's no need to save addends.
> > 
> > 2. Remove the loop in dlstart.c that works like a fake function
> >    call 3 times to process DT_JMPREL, DT_REL, and DT_RELA. Instead
> >    we just need 2 iterations, and now the stride is constant in
> >    each, so they should simplify down a lot more inline.
> > 
> > 3. During the loop that processes DT_REL, count the number of
> >    non-relative relocations (ones we skip at this stage), then
> >    make a VLA this size and pass its address to __dls2 as a second
> >    argument.
> > 
> > 4. Have the do_relocs in stage 2 save addends in this provided
> >    array before overwriting them, and save its address for use by
> >    stage 3.
> > 
> > 5. Have the do_relocs in stage 3 (for ldso/libc only) pull addends
> >    from this array instead of from inline.
> > 
> > Steps 1 and 2 are purely code removal/simplification and should be
> > done regardless of whether we move forward on the above program, I
> > think. Steps 3-5 add some complexity but hardly any code, just a
> > few lines here and there.
> > 
> > Comments?
> 
> I like it.
> 
> The thing that is a bit critical here is the VLA.
> Not because it is a VLA as such, but because it is a dynamic
> allocation on the stack. We already have a similar strategy in
> pthread_create for TLS. The difference is that there we have
> 
> - a sanity check
> - an alternative strategy if the sanity check fails
> 
> Would there be a possibility to have both here, too?

I'm not sure what you're referring to in pthread_create. It has no
VLA. Maybe you mean the choice of whether to put TLS in a
caller-provided thread stack, but I don't think that's an analogous
situation. In that case, the provided stack size is known and TLS is
arbitrarily large (under the control of the app and loaded libs). Here
(dlstart), the stack size is not known and the size of the addend
table is not specifically known, but it's fixed at ld-time for libc.so
and the same for every run, and it's orders of magnitude smaller than
any usable stack (e.g. 14 entries on i386).

There are two potential failure modes here:

1. We run out of stack because RLIMIT_STACK is insanely small. In that
   case we'll quickly crash somewhere else. If there's a risk of
   getting off the stack into other memory, that risk would already
   exist in other places.

2. We run out of stack because the REL table is HUGE. This is a static
   condition for the libc.so ELF file and would not change from run to
   run. If something went wrong in the build process to cause this, it
   needs to be fixed.

So I'm skeptical of the need for a fallback. If we do want/need a
fallback, it would need to be of the following form:

1. Count addends that need to be saved.

2. If cnt