From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/10220
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: musl ldd: swt build: Error relocating / symbol not found
Date: Fri, 24 Jun 2016 13:03:47 -0400
Message-ID: <20160624170347.GB10893@brightrain.aerifal.cx>
References: <576B58E6.6040400@gmail.com>
 <20160623042448.GX10893@brightrain.aerifal.cx>
 <576C02C2.7070006@gmail.com>
 <20160623171008.GY10893@brightrain.aerifal.cx>
 <576C3BB4.3010307@gmail.com>
 <20160623231527.GZ22574@port70.net>
 <576D5E9B.4070608@gmail.com>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1466787857 28515 80.91.229.3 (24 Jun 2016 17:04:17 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Fri, 24 Jun 2016 17:04:17 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-10233-gllmg-musl=m.gmane.org@lists.openwall.com Fri Jun 24 19:04:05 2016
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
 by plane.gmane.org with smtp (Exim 4.69) (envelope-from )
 id 1bGUWS-0007NJ-Ta for gllmg-musl@m.gmane.org;
 Fri, 24 Jun 2016 19:04:05 +0200
Original-Received: (qmail 5880 invoked by uid 550); 24 Jun 2016 17:04:01 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Original-Received: (qmail 5847 invoked from network); 24 Jun 2016 17:04:01 -0000
Content-Disposition: inline
In-Reply-To: <576D5E9B.4070608@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:10220

On Fri, Jun 24, 2016 at 04:23:55PM +0000, Andrei Pozolotin wrote:
> Szabolcs:
>
> On 06/23/2016 11:15 PM, Szabolcs Nagy wrote:
> > * Andrei Pozolotin [2016-06-23 19:42:44 +0000]:
> >> b) while at the same time musl ldd reporting that library dependency
> >> tree is resolved with no error:
> >>
> >> lddtree /usr/lib/libswt-atk-gtk-4530.so
> > that's not musl's ldd, but scanelf from pax-utils
> thank you for pointing out.
> > when debugging such a complicated setup the output
> > of tools that may use subtly different library paths
> > and symbol resolution logic is not very helpful.
> ok, got it.
> > ldd /usr/lib/libswt-gtk-4530.so
> ldd /usr/lib/libswt-gtk-4530.so
>     ldd (0x55e333e6c000)
>     libc.musl-x86_64.so.1 => ldd (0x55e333e6c000)
> > ldd /usr/lib/libswt-atk-gtk-4530.so
> ldd /usr/lib/libswt-atk-gtk-4530.so
>     ldd (0x55edc6edc000)
>     libatk-1.0.so.0 => /usr/lib/libatk-1.0.so.0 (0x7fc763298000)
>     libc.musl-x86_64.so.1 => ldd (0x55edc6edc000)
>     libgobject-2.0.so.0 => /usr/lib/libgobject-2.0.so.0 (0x7fc763058000)
>     libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0x7fc762d6d000)
>     libintl.so.8 => /usr/lib/libintl.so.8 (0x7fc762b5f000)
>     libffi.so.6 => /usr/lib/libffi.so.6 (0x7fc762957000)
>     libpcre.so.1 => /usr/lib/libpcre.so.1 (0x7fc7626fe000)
> > would be more interesting..
> >
> > but even then we don't know what's going on
> > (if libswt-gtk-4530.so is dlopened with RTLD_LOCAL
> > then its libgobject dependency might not be visible
> > to libswt-atk-gtk-4530)
> OK.
> here is the story:
>
> * java native interface: NativeLibrary.load()
> http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/74e5fc94c77b/src/share/classes/java/lang/ClassLoader.java#l1726
>
> * java JNI implementation:
> Java_java_lang_ClassLoader_00024NativeLibrary_load
> http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/74e5fc94c77b/src/share/native/java/lang/ClassLoader.c#l369
>
> * libjvm.so entry point: os::dll_load
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/tip/src/share/vm/prims/jvm.cpp#l3959
>
> * libjvm.so linux implementation
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/share/vm/runtime/os.hpp#l564
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1773
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1767
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1997
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1988
>
> * and finally: it says: dlopen RTLD_LAZY:
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1988
>     void * result = ::dlopen(filename, RTLD_LAZY);
>
> http://linux.die.net/man/3/dlopen
> RTLD_LAZY: Perform lazy binding. Only resolve symbols as the code that
> references them is executed.
> If the symbol is never referenced, then it is never resolved.
> (Lazy binding is only performed for function references;
> references to variables are always immediately bound when the library is
> loaded.)
>
> RTLD_LAZY is good, right? :-)

OK, this is likely the root of the problem: invalid code assuming that
it can load libraries with undefined symbols as long as it doesn't try
to use those code paths. The man page you linked to is rather
poor-quality. When symbol binding takes place with RTLD_LAZY is
actually implementation-defined and can be anywhere between the time
of dlopen and the time of use. The flag should be treated only as a
hint for allowing performance optimizations, not as something that
gives the caller permission to do erroneous things.

Aside from formal correctness, there are multiple reasons for this.
It's architecture- and linktime-option-dependent whether late binding
is even possible at all, and musl purposefully does not implement lazy
binding because it's a huge surface for bugs (which you can see by
looking at glibc's history of bugs caused by lazy binding).

There's one other well-known piece of software, x.org, abusing
RTLD_LAZY in the same way, and we have discussed possible workarounds
before. It would be possible to accept relocations with undefined
symbol references at dlopen time by storing a list of them, and rather
than lazily processing them at call time, re-process them after each
additional dlopen. This would allow broken programs to work without
introducing the bug surface that actual lazy-binding introduces.
However it's a fairly big task to add, and it would be much nicer just
to get the buggy programs fixed (there are already reasonable
workarounds for x.org).

Rich
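
For illustration only, the "store a list and re-process after each
additional dlopen" idea above can be modelled roughly as below. This
is a toy sketch, not musl's dynamic linker code: struct loader,
toy_dlopen, reprocess_pending, toy_demo and the symbol names are all
hypothetical, and "relocations" are reduced to plain symbol-name
strings so that only the bookkeeping is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS    16
#define MAX_PENDING 16

/* Toy loader state (hypothetical): symbols defined so far, plus the
 * undefined references that were deferred instead of rejected. */
struct loader {
    const char *defined[MAX_SYMS];
    size_t ndefined;
    const char *pending[MAX_PENDING];
    size_t npending;
};

static int is_defined(const struct loader *ld, const char *name)
{
    for (size_t i = 0; i < ld->ndefined; i++)
        if (strcmp(ld->defined[i], name) == 0)
            return 1;
    return 0;
}

/* Retry every deferred reference; keep only those still unresolved. */
static void reprocess_pending(struct loader *ld)
{
    size_t kept = 0;
    for (size_t i = 0; i < ld->npending; i++)
        if (!is_defined(ld, ld->pending[i]))
            ld->pending[kept++] = ld->pending[i];
    ld->npending = kept;
}

/* Model of one dlopen: register the library's definitions, defer any
 * reference that cannot be resolved yet, then re-process the whole
 * pending list. Returns 0 on success, -1 if a toy table would overflow. */
int toy_dlopen(struct loader *ld,
               const char *const *defs, size_t ndefs,
               const char *const *refs, size_t nrefs)
{
    if (ld->ndefined + ndefs > MAX_SYMS || ld->npending + nrefs > MAX_PENDING)
        return -1;
    for (size_t i = 0; i < ndefs; i++)
        ld->defined[ld->ndefined++] = defs[i];
    for (size_t i = 0; i < nrefs; i++)
        if (!is_defined(ld, refs[i]))
            ld->pending[ld->npending++] = refs[i];
    reprocess_pending(ld);
    return 0;
}

/* Usage: the first library has a dangling reference; a later load
 * supplies the definition. Returns the number still pending. */
size_t toy_demo(void)
{
    struct loader ld = {0};
    const char *const first_defs[]  = { "swt_fn" };      /* illustrative name */
    const char *const first_refs[]  = { "gobject_fn" };  /* illustrative name */
    const char *const second_defs[] = { "gobject_fn" };

    toy_dlopen(&ld, first_defs, 1, first_refs, 1);
    assert(ld.npending == 1);   /* accepted at load time, but deferred */
    toy_dlopen(&ld, second_defs, 1, NULL, 0);
    return ld.npending;         /* resolved by the second load */
}
```

Nothing here is resolved at call time: a reference is bound either
during the dlopen that introduces it or during a later dlopen, which
is the property that keeps this apart from real lazy binding. What a
real loader should do when a still-pending symbol is actually called
is left out of the sketch.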