* Further dynamic linker optimizations
From: Rich Felker @ 2015-06-30 20:04 UTC
To: musl

Discussion on #musl with Timo Teräs has produced the following
results:

- Moving bloom filter size to struct dso gives 5% improvement in clang
  (built as 110 .so's) start time, simply because of a reduction of
  number of instructions in the hot path. So I think we should apply
  that patch.

- The whole outer for loop in find_sym is the hot path for
  performance. As such, eliminating the lazy calculation of gnu_hash
  and simply doing it before the loop should be a measurable win, just
  by removing the if (!ghm) branch.

- Even the check if (!dso->global) continue; has nontrivial cost.
  Since I want to replace this representation with a separate
  linked-list chain for global dsos anyway (for other reasons) I think
  that's worth prioritizing for performance too.

- We still don't save and reuse the last symbol lookup in do_relocs.
  Doing so could improve performance a lot when the same symbol is
  referenced multiple times from global data. When the only references
  are the GOT (thus only one per symbol), it's not going to help, but
  since it's outside the find_sym dso loop, it should not have
  measurable cost anyway.

- String comparison (dl_strcmp) is costly, but nontrivial to optimize.
  Word-at-a-time optimizations have issues with crossing pages, even
  on archs that don't require aligned access. Probably the right way
  forward here is to get an optimized general strcmp, then add a
  mechanism (function pointer in struct dso? or global?) for the
  dynamic linker to call dl_strcmp when relocating itself but the real
  strcmp later.

- The strength-reduction of remainder operations does not seem to
  provide worthwhile benefits yet, simply because so little of the
  overall time is spent on the division/remainder.

Rich
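
[Editorial sketch] The "save and reuse the last symbol lookup" item above
can be illustrated with a small, self-contained C example. This is not
musl's do_relocs or find_sym: every name here (slow_find_sym,
apply_relocs, struct reloc, the two-entry table) is a hypothetical
stand-in, and the only point is that the cache check sits outside the
expensive lookup, so consecutive relocations against the same symbol
index pay for one lookup.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a tiny symbol table and an expensive lookup. */
struct sym { const char *name; void *addr; };

static int sym_x, sym_y;
static struct sym table[] = { { "x", &sym_x }, { "y", &sym_y } };
static unsigned long slow_lookups; /* counts trips through the slow path */

static void *slow_find_sym(const char *name)
{
	slow_lookups++;
	for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
		if (!strcmp(table[i].name, name)) return table[i].addr;
	return 0;
}

/* One relocation: the symbol index it refers to and where to store. */
struct reloc { size_t sym_index; void **where; };

static void apply_relocs(const struct reloc *rel, size_t n,
                         const char *const *names)
{
	size_t last_index = (size_t)-1; /* nothing cached yet */
	void *last_addr = 0;

	for (size_t i = 0; i < n; i++) {
		/* Only fall into the slow path when the symbol changes. */
		if (rel[i].sym_index != last_index) {
			last_addr = slow_find_sym(names[rel[i].sym_index]);
			last_index = rel[i].sym_index;
		}
		*rel[i].where = last_addr;
	}
}
```

With five relocations of which three refer to index 0 and two to index
1, the slow path runs twice. If every symbol is referenced exactly once
(the GOT-only case mentioned above), the cache never hits and the extra
cost is one integer comparison per relocation.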
* Re: Further dynamic linker optimizations
From: Timo Teras @ 2015-07-01 5:41 UTC
To: Rich Felker; +Cc: musl

On Tue, 30 Jun 2015 16:04:54 -0400
Rich Felker <dalias@libc.org> wrote:

> Discussion on #musl with Timo Teräs has produced the following
> results:

Nice summary. Thanks!

> - The whole outer for loop in find_sym is the hot path for
>   performance. As such, eliminating the lazy calculation of gnu_hash
>   and simply doing it before the loop should be a measurable win, just
>   by removing the if (!ghm) branch.

Additional thought. We could do a skip list here. If we calculate the
gnu-hash unconditionally, we could bloom filter bits to construct a
skip list.

That is, we have next_symlookup[] array that has pointer for each
wordsize bits (or potentially a small multiple of it). And we would link
each dso in next_symlookup array corresponding to each bloom filter
bit (for dso without gnu-hash it'd have to go to all of them). Then on
lookup we could just use the calculated bloomfilter to follow the
correct symlookup chain next pointers.

If the pointer array size is less than the bloom filter size, the bloom
filter can be always reduced by |= individual elements together. Though,
it'd probably need some analysis on how this would work out. If ORring
all elements together always yields all bits set, this is kinda
useless.

This should be significant win on cases like clang when there are tens
of thousands of symbol lookups, and 100+ dsos. Trade off is of course
little memory and little extra time to setup the additional chains.

Thoughts?
* Re: Further dynamic linker optimizations
From: Rich Felker @ 2015-07-01 14:03 UTC
To: musl

On Wed, Jul 01, 2015 at 08:41:29AM +0300, Timo Teras wrote:
> On Tue, 30 Jun 2015 16:04:54 -0400
> Rich Felker <dalias@libc.org> wrote:
>
> > Discussion on #musl with Timo Teräs has produced the following
> > results:
>
> Nice summary. Thanks!
>
> > - The whole outer for loop in find_sym is the hot path for
> >   performance. As such, eliminating the lazy calculation of gnu_hash
> >   and simply doing it before the loop should be a measurable win, just
> >   by removing the if (!ghm) branch.
>
> Additional thought. We could do a skip list here. If we calculate the
> gnu-hash unconditionally, we could bloom filter bits to construct a
> skip list.

This wasn't something I recall discussing...

> That is, we have next_symlookup[] array that has pointer for each
> wordsize bits (or potentially a small multiple of it). And we would link
> each dso in next_symlookup array corresponding to each bloom filter
> bit (for dso without gnu-hash it'd have to go to all of them). Then on
> lookup we could just use the calculated bloomfilter to follow the
> correct symlookup chain next pointers.

This is a very large size increase, and perhaps notable startup time
increase, just for the sake of mislinked (clang) applications. It's
something like the idea I wanted to do in a static linker, albeit with
a much larger global table for where to start based on hash%largeval
rather than local next/skip tables per module. But I don't think it's
appropriate for dynamic linking.

Rich
* Re: Further dynamic linker optimizations
From: Timo Teras @ 2015-07-01 14:10 UTC
To: Rich Felker; +Cc: musl

On Wed, 1 Jul 2015 10:03:27 -0400
Rich Felker <dalias@libc.org> wrote:

> On Wed, Jul 01, 2015 at 08:41:29AM +0300, Timo Teras wrote:
> > On Tue, 30 Jun 2015 16:04:54 -0400
> > Rich Felker <dalias@libc.org> wrote:
> >
> > > Discussion on #musl with Timo Teräs has produced the following
> > > results:
> >
> > Nice summary. Thanks!
> >
> > > - The whole outer for loop in find_sym is the hot path for
> > >   performance. As such, eliminating the lazy calculation of
> > >   gnu_hash and simply doing it before the loop should be a
> > >   measurable win, just by removing the if (!ghm) branch.
> >
> > Additional thought. We could do a skip list here. If we calculate
> > the gnu-hash unconditionally, we could bloom filter bits to
> > construct a skip list.
>
> This wasn't something I recall discussing...

Yes. That's why I said "additional thought". :)

> > That is, we have next_symlookup[] array that has pointer for each
> > wordsize bits (or potentially a small multiple of it). And we would
> > link each dso in next_symlookup array corresponding to each bloom
> > filter bit (for dso without gnu-hash it'd have to go to all of
> > them). Then on lookup we could just use the calculated bloomfilter
> > to follow the correct symlookup chain next pointers.
>
> This is a very large size increase, and perhaps notable startup time
> increase, just for the sake of mislinked (clang) applications. It's
> something like the idea I wanted to do in a static linker, albeit with
> a much larger global table for where to start based on hash%largeval
> rather than local next/skip tables per module. But I don't think it's
> appropriate for dynamic linking.

Yes. And after some trivial benchmarking, it seems to not give any
significant improvement. bloomfilter using only one machine word is not
enough for anything. And doing anything larger gives too much memory
use overhead.

Just moving the gnu-hash calculation out of the loop and doing it
always + removing the ->global check will give already quite noticeable
boost.

Thanks.
Timo
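
[Editorial sketch] The two changes Timo settles on (hash computed once
before the loop, and no per-module ->global test because only global
modules are on the chain being walked) can be sketched in C as follows.
This is a deliberately simplified model, not musl's code: struct mod,
global_next, the one-word filter and the linear name search are all
hypothetical. A real DT_GNU_HASH section uses a multi-word bloom filter
and hash buckets; the single 32-bit filter here only keeps the example
short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The GNU symbol hash: start at 5381, then h = h*33 + c per byte. */
static uint32_t gnu_hash(const char *s)
{
	uint32_t h = 5381;
	for (; *s; s++) h = h * 33 + (unsigned char)*s;
	return h;
}

/* Hypothetical, much simplified module record. */
struct mod {
	struct mod *global_next;   /* chain linking only the global modules */
	uint32_t filter;           /* one bit set per defined symbol */
	const char *const *names;  /* NULL-terminated list of definitions */
};

static uint32_t filter_bit(uint32_t h) { return UINT32_C(1) << (h % 32); }

/* Build the module's filter from its symbol names (done once at load). */
static void mod_add_filter(struct mod *m)
{
	m->filter = 0;
	for (const char *const *p = m->names; *p; p++)
		m->filter |= filter_bit(gnu_hash(*p));
}

static const struct mod *find_sym(const struct mod *head, const char *name)
{
	/* Hash and filter bit are computed once, before the hot loop. */
	uint32_t bit = filter_bit(gnu_hash(name));

	/* Walking the global chain removes the per-module "is it global?"
	 * branch: non-global modules are simply never visited. */
	for (const struct mod *m = head; m; m = m->global_next) {
		if (!(m->filter & bit)) continue; /* cannot be defined here */
		for (const char *const *p = m->names; *p; p++)
			if (!strcmp(*p, name)) return m;
	}
	return 0;
}
```

The filter can only produce false positives, never false negatives, so
skipping a module on a clear bit is always safe. With many symbols per
module a one-word filter saturates quickly, which matches the remark
above that a single machine word is not enough.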
* Re: Further dynamic linker optimizations
From: Alexander Monakov @ 2015-07-07 18:39 UTC
To: musl

On Tue, 30 Jun 2015, Rich Felker wrote:

> Discussion on #musl with Timo Teräs has produced the following
> results:
>
> - Moving bloom filter size to struct dso gives 5% improvement in clang
>   (built as 110 .so's) start time, simply because of a reduction of
>   number of instructions in the hot path. So I think we should apply
>   that patch.

I think most of the improvement here actually comes from fewer cache misses.
As a result, I think we should take this idea further and shuffle struct dso a
little bit so that fields accessed in the hot find_sym loop are packed
together, if possible.

> - The whole outer for loop in find_sym is the hot path for
>   performance. As such, eliminating the lazy calculation of gnu_hash
>   and simply doing it before the loop should be a measurable win, just
>   by removing the if (!ghm) branch.

On a related note, it's possible to avoid calculating sysv hash, if gnu-hash
is enabled system-wide, by not setting 'global' flag on the vdso item (as
mentioned on IRC in your conversation with Timo).

> - Even the check if (!dso->global) continue; has nontrivial cost.
>   Since I want to replace this representation with a separate
>   linked-list chain for global dsos anyway (for other reasons) I think
>   that's worth prioritizing for performance too.

I'm curious what the other reasons are? :)

> - The strength-reduction of remainder operations does not seem to
>   provide worthwhile benefits yet, simply because so little of the
>   overall time is spent on the division/remainder.

On IRC we noted that on AArch64 it's slower than native div/mod on our
microbenchmark, and on ARM the speedup is smaller than expected. My testing
on x86 indicates that it's not profitable in the dynamic linker (not sure
why).

Alexander
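
[Editorial sketch] For readers unfamiliar with "strength-reduction of
remainder operations": the idea is to replace h % n, where n is fixed per
module, with multiplications by a constant precomputed from n. The thread
does not say which variant was benchmarked, so the C code below is an
assumption: it shows one known multiplication-based formulation for
32-bit operands, with hypothetical names rem_magic and rem_fast, and uses
only 64-bit arithmetic so it does not depend on a 128-bit integer type.

```c
#include <stdint.h>

/* Precompute magic = floor((2^64 - 1) / d) + 1 for a divisor d >= 1.
 * For d == 1 the sum wraps to 0, which still yields remainder 0. */
static uint64_t rem_magic(uint32_t d)
{
	return UINT64_MAX / d + 1;
}

/* Compute a % d from the precomputed magic: multiplications and
 * shifts only, no division instruction. */
static uint32_t rem_fast(uint32_t a, uint64_t magic, uint32_t d)
{
	uint64_t low = magic * a; /* low 64 bits of magic * a (wraps) */

	/* The result is floor(low * d / 2^64), i.e. the top 32 bits of the
	 * 96-bit product, assembled from two 64-bit partial products. */
	uint64_t hi = (low >> 32) * d;
	uint64_t lo = (low & 0xffffffffu) * d;
	return (uint32_t)((hi + (lo >> 32)) >> 32);
}
```

The magic value would be computed once per module and stored next to the
divisor, so the hot path pays for three multiplications instead of one
division. Whether that wins depends on the CPU's divider, which is
consistent with the mixed AArch64, ARM and x86 results reported above.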
* Re: Further dynamic linker optimizations
From: Rich Felker @ 2015-07-07 18:55 UTC
To: musl

On Tue, Jul 07, 2015 at 09:39:09PM +0300, Alexander Monakov wrote:
> On Tue, 30 Jun 2015, Rich Felker wrote:
>
> > Discussion on #musl with Timo Teräs has produced the following
> > results:
> >
> > - Moving bloom filter size to struct dso gives 5% improvement in clang
> >   (built as 110 .so's) start time, simply because of a reduction of
> >   number of instructions in the hot path. So I think we should apply
> >   that patch.
>
> I think most of the improvement here actually comes from fewer cache misses.
> As a result, I think we should take this idea further and shuffle struct dso a
> little bit so that fields accessed in the hot find_sym loop are packed
> together, if possible.

I'm not entirely convinced; the 5% seems consistent with the number of
instructions in the code path. Can you confirm this with cache miss
measurements? Or just by obtaining better timings reordering data for
cache locality? Note that the head of struct dso has to remain fixed
(it's gdb ABI :/) but the rest is free to change.

> > - The whole outer for loop in find_sym is the hot path for
> >   performance. As such, eliminating the lazy calculation of gnu_hash
> >   and simply doing it before the loop should be a measurable win, just
> >   by removing the if (!ghm) branch.
>
> On a related note, it's possible to avoid calculating sysv hash, if gnu-hash
> is enabled system-wide, by not setting 'global' flag on the vdso item (as
> mentioned on IRC in your conversation with Timo).

Yes, and I think this sounds like a worthwhile approach. Seeing
timings for it would be great. :-)

> > - Even the check if (!dso->global) continue; has nontrivial cost.
> >   Since I want to replace this representation with a separate
> >   linked-list chain for global dsos anyway (for other reasons) I think
> >   that's worth prioritizing for performance too.
>
> I'm curious what the other reasons are? :)

Depending on an open question I have to the Austin Group list (sorry,
I can't get the archives to work to provide a link), changes may be
needed for semantic correctness. It's easier to describe the issue
with code. Compile the attached test case with the following commands:

gcc -shared -fPIC -DLIB -o libA.so dlorder.c
gcc -shared -fPIC -DLIB -o libB.so dlorder.c
gcc -o dlorder dlorder.c

On musl it prints 2 different addresses (the subsequent RTLD_GLOBAL
changes the definition of a symbol) which I think is wrong, but I
haven't yet checked what other implementations do.

> > - The strength-reduction of remainder operations does not seem to
> >   provide worthwhile benefits yet, simply because so little of the
> >   overall time is spent on the division/remainder.
>
> On IRC we noted that on AArch64 it's slower than native div/mod on our
> microbenchmark, and on ARM the speedup is smaller than expected. My testing
> on x86 indicates that it's not profitable in the dynamic linker (not sure
> why).

Agreed, but I think we do know why it's not profitable: at least in
the cases tested, the time spent on remainders is negligible anyway.

Rich

[-- Attachment: dlorder.c --]

#ifdef LIB
int foo = 42;
#else
#include <dlfcn.h>
#include <stdio.h>
int main()
{
	void *h1, *h2, *hg;
	h1 = dlopen("./libA.so", RTLD_NOW|RTLD_LOCAL);
	h2 = dlopen("./libB.so", RTLD_NOW|RTLD_GLOBAL);
	hg = dlopen(0, RTLD_NOW|RTLD_GLOBAL);
	printf("%p\n", dlsym(hg, "foo"));
	dlopen("./libA.so", RTLD_NOW|RTLD_GLOBAL);
	printf("%p\n", dlsym(hg, "foo"));
}
#endif
* Re: Further dynamic linker optimizations
From: Timo Teras @ 2015-07-08 5:48 UTC
To: Rich Felker; +Cc: musl

On Tue, 7 Jul 2015 14:55:05 -0400
Rich Felker <dalias@libc.org> wrote:

> On Tue, Jul 07, 2015 at 09:39:09PM +0300, Alexander Monakov wrote:
> > On Tue, 30 Jun 2015, Rich Felker wrote:
> >
> > > Discussion on #musl with Timo Teräs has produced the following
> > > results:
> > >
> > > - Moving bloom filter size to struct dso gives 5% improvement in
> > >   clang (built as 110 .so's) start time, simply because of a
> > >   reduction of number of instructions in the hot path. So I think
> > >   we should apply that patch.
> >
> > I think most of the improvement here actually comes from fewer
> > cache misses. As a result, I think we should take this idea further
> > and shuffle struct dso a little bit so that fields accessed in the
> > hot find_sym loop are packed together, if possible.
>
> I'm not entirely convinced; the 5% seems consistent with the number of
> instructions in the code path. Can you confirm this with cache miss
> measurements? Or just by obtaining better timings reordering data for
> cache locality? Note that the head of struct dso has to remain fixed
> (it's gdb ABI :/) but the rest is free to change.

I used cachegrind and callgrind to benchmark. In my case there was no
change in cache miss number - the speed up was purely based on running
less instructions on the hot path.

Though, I ran this on i7 with lot of cache. Cache misses could become
issue on smaller cpus. But I suspect the bloom filter is doing good
enough job to keep cache usage on sensible levels.

> > > - The whole outer for loop in find_sym is the hot path for
> > >   performance. As such, eliminating the lazy calculation of
> > >   gnu_hash and simply doing it before the loop should be a
> > >   measurable win, just by removing the if (!ghm) branch.
> >
> > On a related note, it's possible to avoid calculating sysv hash, if
> > gnu-hash is enabled system-wide, by not setting 'global' flag on
> > the vdso item (as mentioned on IRC in your conversation with Timo).
>
> Yes, and I think this sounds like a worthwhile approach. Seeing
> timings for it would be great. :-)

I told them earlier in IRC. But on the same i7 box and running "clang
--version" which has 100+ DT_NEEDED... removing vdso and thus sysv
hashing had magnitude of tens of milliseconds. (I wonder how it'd
perform if we calculated both sysv and gnu hashes at same time.)

Removing the 'global' flag testing, and making gnu-hash calculation
unconditional together were also a measurable speed-up. Around 5-10
milliseconds.

For reference, "time clang --version" on my Intel(R) Core(TM) i7-4510U:
 - current musl release: ~160 ms
 - current git master: ~90 ms
 - ghashmask added: ~83 ms
 - sysv hash calc removed: ~77 ms
 - global test removed, unconditional gnu-hash: ~71 ms

As another reference, "clang --version" currently takes about 3 seconds
on Wandboard ARM box. But I have no numbers on the speed up on that
box.

Thanks,
Timo
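
[Editorial sketch] Timo's aside about calculating the sysv and gnu hashes
at the same time can be made concrete with a fused loop in C. The two
hash functions below follow their standard published definitions; the
struct hashes type and the both_hashes function are hypothetical names
introduced for this example. The sketch only shows that one pass over the
string can produce both values. It says nothing about whether this is
faster in practice, which the thread leaves open.

```c
#include <stdint.h>

struct hashes { uint32_t sysv, gnu; };

/* Classic System V ELF hash, kept to 32 bits. */
static uint32_t sysv_hash(const char *s)
{
	uint32_t h = 0;
	for (; *s; s++) {
		h = (h << 4) + (unsigned char)*s;
		uint32_t g = h & 0xf0000000;
		if (g) h ^= g >> 24;
		h &= ~g;
	}
	return h;
}

/* GNU hash: start at 5381, then h = h*33 + c per byte. */
static uint32_t gnu_hash(const char *s)
{
	uint32_t h = 5381;
	for (; *s; s++) h = h * 33 + (unsigned char)*s;
	return h;
}

/* Both hashes in a single pass over the name. */
static struct hashes both_hashes(const char *s)
{
	struct hashes r = { 0, 5381 };
	for (; *s; s++) {
		unsigned char c = (unsigned char)*s;

		r.sysv = (r.sysv << 4) + c;
		uint32_t g = r.sysv & 0xf0000000;
		r.sysv ^= g >> 24;  /* no-op when g == 0 */
		r.sysv &= ~g;

		r.gnu = r.gnu * 33 + c;
	}
	return r;
}
```

The fused version reads each byte once instead of twice, but the two
recurrences are independent, so the gain (if any) depends on how well the
CPU already overlaps the separate loops.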
* Re: Further dynamic linker optimizations
From: Andy Lutomirski @ 2015-08-05 22:37 UTC
To: musl, Rich Felker

On 07/07/2015 10:48 PM, Timo Teras wrote:
> On Tue, 7 Jul 2015 14:55:05 -0400
> Rich Felker <dalias@libc.org> wrote:
>
>> On Tue, Jul 07, 2015 at 09:39:09PM +0300, Alexander Monakov wrote:
>>> On Tue, 30 Jun 2015, Rich Felker wrote:
>>>
>>>> Discussion on #musl with Timo Teräs has produced the following
>>>> results:
>>>>
>>>> - Moving bloom filter size to struct dso gives 5% improvement in
>>>>   clang (built as 110 .so's) start time, simply because of a
>>>>   reduction of number of instructions in the hot path. So I think
>>>>   we should apply that patch.
>>>
>>> I think most of the improvement here actually comes from fewer
>>> cache misses. As a result, I think we should take this idea further
>>> and shuffle struct dso a little bit so that fields accessed in the
>>> hot find_sym loop are packed together, if possible.
>>
>> I'm not entirely convinced; the 5% seems consistent with the number of
>> instructions in the code path. Can you confirm this with cache miss
>> measurements? Or just by obtaining better timings reordering data for
>> cache locality? Note that the head of struct dso has to remain fixed
>> (it's gdb ABI :/) but the rest is free to change.
>
> I used cachegrind and callgrind to benchmark. In my case there was no
> change in cache miss number - the speed up was purely based on running
> less instructions on the hot path.
>
> Though, I ran this on i7 with lot of cache. Cache misses could become
> issue on smaller cpus. But I suspect the bloom filter is doing good
> enough job to keep cache usage on sensible levels.
>
>>>> - The whole outer for loop in find_sym is the hot path for
>>>>   performance. As such, eliminating the lazy calculation of
>>>>   gnu_hash and simply doing it before the loop should be a
>>>>   measurable win, just by removing the if (!ghm) branch.
>>>
>>> On a related note, it's possible to avoid calculating sysv hash, if
>>> gnu-hash is enabled system-wide, by not setting 'global' flag on
>>> the vdso item (as mentioned on IRC in your conversation with Timo).
>>
>> Yes, and I think this sounds like a worthwhile approach. Seeing
>> timings for it would be great. :-)
>
> I told them earlier in IRC. But on the same i7 box and running "clang
> --version" which has 100+ DT_NEEDED... removing vdso and thus sysv
> hashing had magnitude of tens of milliseconds. (I wonder how it'd
> perform if we calculated both sysv and gnu hashes at same time.)

/me dons vdso maintainer hat.

I can add a GNU hash to the vdso quite easily (for Linux 4.3). Would
that be helpful?

--Andy
* Re: Re: Further dynamic linker optimizations
From: Rich Felker @ 2015-08-06 3:04 UTC
To: Andy Lutomirski; +Cc: musl

On Wed, Aug 05, 2015 at 03:37:25PM -0700, Andy Lutomirski wrote:
> >>>>- The whole outer for loop in find_sym is the hot path for
> >>>>  performance. As such, eliminating the lazy calculation of
> >>>>gnu_hash and simply doing it before the loop should be a
> >>>>measurable win, just by removing the if (!ghm) branch.
> >>>
> >>>On a related note, it's possible to avoid calculating sysv hash, if
> >>>gnu-hash is enabled system-wide, by not setting 'global' flag on
> >>>the vdso item (as mentioned on IRC in your conversation with Timo).
> >>
> >>Yes, and I think this sounds like a worthwhile approach. Seeing
> >>timings for it would be great. :-)
> >
> >I told them earlier in IRC. But on the same i7 box and running "clang
> >--version" which has 100+ DT_NEEDED... removing vdso and thus sysv
> >hashing had magnitude of tens of milliseconds. (I wonder how it'd
> >perform if we calculated both sysv and gnu hashes at same time.)
>
> /me dons vdso maintainer hat.
>
> I can add a GNU hash to the vdso quite easily (for Linux 4.3).
> Would that be helpful?

Yes, and I'd lean towards doing this unless you can see any
disadvantages to weigh it against (using more pages? would that
matter?).

But either way I think we should make the change on the musl side too.
It doesn't make sense for the vdso to appear in the global namespace
unless it was actually pulled in by dlopen/RTLD_GLOBAL. For actually
using the vdso symbols, we don't use the dynamic linker anyway; we
look them up directly so that they work with static linking (and
because the way the dynamic linker/libc is linked precludes vdso
symbols getting used to resolve its own references, anyway).

Rich
* Re: Re: Further dynamic linker optimizations
From: Isaac Dunham @ 2015-08-06 4:32 UTC
To: musl; +Cc: Rich Felker

On Wed, Aug 05, 2015 at 03:37:25PM -0700, Andy Lutomirski wrote:
> On 07/07/2015 10:48 PM, Timo Teras wrote:
> >On Tue, 7 Jul 2015 14:55:05 -0400
> >Rich Felker <dalias@libc.org> wrote:
> >
> >>On Tue, Jul 07, 2015 at 09:39:09PM +0300, Alexander Monakov wrote:
> >>>On Tue, 30 Jun 2015, Rich Felker wrote:
> >>>
> >>>>Discussion on #musl with Timo Teräs has produced the following
> >>>>results:
> >>>>
> >>>>- Moving bloom filter size to struct dso gives 5% improvement in
> >>>>clang (built as 110 .so's) start time, simply because of a
> >>>>reduction of number of instructions in the hot path. So I think
> >>>>we should apply that patch.
> >>>
> >>>I think most of the improvement here actually comes from fewer
> >>>cache misses. As a result, I think we should take this idea further
> >>>and shuffle struct dso a little bit so that fields accessed in the
> >>>hot find_sym loop are packed together, if possible.
> >>
> >>I'm not entirely convinced; the 5% seems consistent with the number of
> >>instructions in the code path. Can you confirm this with cache miss
> >>measurements? Or just by obtaining better timings reordering data for
> >>cache locality? Note that the head of struct dso has to remain fixed
> >>(it's gdb ABI :/) but the rest is free to change.
> >
> >I used cachegrind and callgrind to benchmark. In my case there was no
> >change in cache miss number - the speed up was purely based on running
> >less instructions on the hot path.
> >
> >Though, I ran this on i7 with lot of cache. Cache misses could become
> >issue on smaller cpus. But I suspect the bloom filter is doing good
> >enough job to keep cache usage on sensible levels.
> >
> >>>>- The whole outer for loop in find_sym is the hot path for
> >>>>  performance. As such, eliminating the lazy calculation of
> >>>>gnu_hash and simply doing it before the loop should be a
> >>>>measurable win, just by removing the if (!ghm) branch.
> >>>
> >>>On a related note, it's possible to avoid calculating sysv hash, if
> >>>gnu-hash is enabled system-wide, by not setting 'global' flag on
> >>>the vdso item (as mentioned on IRC in your conversation with Timo).
> >>
> >>Yes, and I think this sounds like a worthwhile approach. Seeing
> >>timings for it would be great. :-)
> >
> >I told them earlier in IRC. But on the same i7 box and running "clang
> >--version" which has 100+ DT_NEEDED... removing vdso and thus sysv
> >hashing had magnitude of tens of milliseconds. (I wonder how it'd
> >perform if we calculated both sysv and gnu hashes at same time.)
>
> /me dons vdso maintainer hat.
>
> I can add a GNU hash to the vdso quite easily (for Linux 4.3). Would that
> be helpful?

Would this require a binutils version that supports GNU hashes?
And if so, would it be a hard build-time requirement?

Thanks,
Isaac Dunham
* Re: Re: Further dynamic linker optimizations
From: Szabolcs Nagy @ 2015-08-06 9:33 UTC
To: Isaac Dunham; +Cc: musl, Rich Felker, Andy Lutomirski

* Isaac Dunham <ibid.ag@gmail.com> [2015-08-05 21:32:53 -0700]:
> On Wed, Aug 05, 2015 at 03:37:25PM -0700, Andy Lutomirski wrote:
> >
> > I can add a GNU hash to the vdso quite easily (for Linux 4.3). Would that
> > be helpful?
>
> Would this require a binutils version that supports GNU hashes?
> And if so, would it be a hard build-time requirement?
>

vdso is only used at runtime, so static linker support is not
needed when you build applications.

i guess for building the kernel itself linking the vdso.so
will depend on --hash-style=gnu support in the target ld,
that is binutils 2.18.
* Re: Re: Further dynamic linker optimizations
From: Andy Lutomirski @ 2015-08-06 15:13 UTC
To: Isaac Dunham, musl, Rich Felker, Andy Lutomirski

On Thu, Aug 6, 2015 at 2:33 AM, Szabolcs Nagy <nsz@port70.net> wrote:
> * Isaac Dunham <ibid.ag@gmail.com> [2015-08-05 21:32:53 -0700]:
>> On Wed, Aug 05, 2015 at 03:37:25PM -0700, Andy Lutomirski wrote:
>> >
>> > I can add a GNU hash to the vdso quite easily (for Linux 4.3). Would that
>> > be helpful?
>>
>> Would this require a binutils version that supports GNU hashes?
>> And if so, would it be a hard build-time requirement?
>>
>
> vdso is only used at runtime, so static linker support is not
> needed when you build applications.
>
> i guess for building the kernel itself linking the vdso.so
> will depend on --hash-style=gnu support in the target ld,
> that is binutils 2.18.

Yes, exactly. I'll do this for x86, and I'll encourage the other arch
vdso maintainers to do the same thing. If a kernel is built with old
binutils, then the gnu hash won't be there.

--Andy