Am Montag, den 25.05.2015, 02:57 -0400 schrieb Rich Felker: > On Mon, May 25, 2015 at 08:31:29AM +0200, Jens Gustedt wrote: > > Am Sonntag, den 24.05.2015, 20:36 -0400 schrieb Rich Felker: > > > There's a simple alternative I just came up with though: have > > > dlstart.c compute the number of REL entries that need their addends > > > saved and allocate a VLA on its stack for stages 2 and 3 to use. While > > > the number of addends could be significant, it's many orders of > > > magnitude smaller than the smallest practical stack sizes we could > > > actually run with, so it's perfectly safe to put it on the stack. > > > > > > Here're the basic changes I'd like to make to dlstart.c to implement > > > this: > > > > > > 1. Remove processing of DT_JMPREL REL/RELA table. All entries in this > > > table are necessarily JMP_SLOT type, and thus symbolic, so there's > > > nothing stage 1 can do with them anyway. Also, being JMP_SLOT type, > > > they have implicit addends of zero if they're REL-type, so there's > > > no need to save addends. > > > > > > 2. Remove the loop in dlstart.c that works like a fake function call 3 > > > times to process DT_JMPREL, DT_REL, and DT_RELA. Instead we just > > > need 2 iterations, and now the stride is constant in each, so they > > > should simplify down a lot more inline. > > > > > > 3. During the loop that processes DT_REL, count the number of > > > non-relative relocations (ones we skip at this stage), then make a > > > VLA this size and pass its address to __dls2 as a second argument. > > > > > > 4. Have the do_relocs in stage 2 save addends in this provided array > > > before overwriting them, and save its address for use by stage 3. > > > > > > 5. Have the do_relocs in stage 3 (for ldso/libc only) pull addends > > > from this array instead of of from inline. > > > > > > Steps 1 and 2 are purely code removal/simplification and should be > > > done regardless of whether we move forward on the above program, I > > > think. Steps 3-5 add some complexity but hardly any code, just a few > > > lines here and there. > > > > > > Comments? > > > > I like it. > > > > The thing that is a bit critical here, is the VLA. Not because it is a > > VLA as such, but because it is a dynamic allocation on the stack. We > > already have a similar strategy in pthread_create for TLS. The > > difference is that there we have > > > > - a sanity check > > - an alternative strategy if the sanity check fails > > > > Would there be a possibility to have both here, too? > > I'm not sure what you're referring to in pthread_create. It has no > VLA. Maybe you mean the choice of whether to put TLS in a > caller-provided thread stack, that's what I meant > but I don't think that's an analogous > situaton. In that one, the provided stack size is known and TLS is > arbitrarily large (under the control of the app and loaded libs). Here > (dlstart), the stack size is not known and the size of the addend > table is not specifically known, but it's fixed at ld-time for libc.so > and the same for every run, and it's orders of magnitude smaller than > any usable stack (e.g. 14 entries on i386). > There are two potential failure modes here: > > 1. We run out of stack because RLIMIT_STACK is insanely small. In that > case we'll quickly crash somewhere else. If there's a risk of > getting off the stack into other memory, that risk would already > exist in other places. > > 2. We run out of stack because the REL table is HUGE. This is a static > condition for the libc.so ELF file and would not change from run to > run. If something went wrong in the build process to cause this, it > needs to be fixed. > > So I'm skeptical of the need for a fallback. > > If we do want/need a fallback, it would need to be of the following > form: > > 1. Count addends that need to be saved. > > 2. If cnt produce VLA of size 1 and pass a null pointer instead of a pointer > to the VLA. Ok, so we have a simple sanity check, great. > 3. In stage 2, if the addend-buffer pointer is null, allocate storage > for it with __syscall(SYS_mmap, ...) (we can't call mmap yet). > Abort on failure. > > 4. In stage 3, if the addend-buffer was allocated donate it to malloc > after we're done using it. > > I would probably prefer just having it crash/abort in step 3, me too > since > this condition obviously indicates a broken build. It's better to > catch such an issue and fix it than to have a non-robust libc.so that > breaks fail-safety for no-third-party-libs binaries by depending on > allocation. Observe that I said "alternative strategy" not "fallback". "Crash early" to be that strategy sounds completely ok to me. Thanks Jens -- :: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS ::: :: ::::::::::::::: office Strasbourg : +33 368854536 :: :: :::::::::::::::::::::: gsm France : +33 651400183 :: :: ::::::::::::::: gsm international : +49 15737185122 :: :: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::