From: Szabolcs Nagy
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: dlsym returning unresolved symbol address instead of dependency library symbol address
Date: Sat, 10 Aug 2019 19:27:49 +0200
Message-ID: <20190810172749.GJ22009@port70.net>
References: <20190810101111.GH22009@port70.net> <20190810164252.GL9017@brightrain.aerifal.cx>
In-Reply-To: <20190810164252.GL9017@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Cc: Luiz Angelo Daros de Luca

* Rich Felker [2019-08-10 12:42:52 -0400]:
> On Sat, Aug 10, 2019 at 12:11:11PM +0200, Szabolcs Nagy wrote:
> > * Luiz Angelo Daros de Luca [2019-08-10 05:16:19 -0300]:
> > > I'm ruby maintainer in OpenWrt 18.06 (musl 1.1.19).
> > > I got a bug report (
> > > https://github.com/openwrt/packages/issues/9297) related to musl in mipsel
> > > 32bit.
> > >
> > > When ruby loads a module (.so), it checks if that module was built for the
> > > same ruby that is loading it. Ruby loads libruby at startup, which exports
> > > ruby_xmalloc sym. So, the check consists on loading the module, searching
> > > for ruby_xmalloc in the module context and comparing with global
> > > ruby_xmalloc address. If they do not match, the module is using a different
> > > libruby. Something like this:
> > >
> > > handle = (void*)dlopen(file, RTLD_LAZY|RTLD_GLOBAL)
> > > void *ex = dlsym(handle, EXTERNAL_PREFIX"ruby_xmalloc");
> > > if (ex && ex != ruby_xmalloc) {
> > >    // module is incompatible!
> > > }
> > >
> > > The first time a module is loaded, it simply works as expected.
> > > I debugged and musl is working nicely. At do_dlsym(struct dso *p, const
> > > char *s, void *ra), it correctly fails to find the symbol with:
> > >
> > > sym = sysv_lookup(s, h, p)
> > >
> > > and correctly find it with:
> > >
> > > sysv_lookup(s, h, p->deps[0])
> > >
> > > Now, when the second module is loaded, it find "ruby_xmalloc" already with:
> > >
> > > sym = sysv_lookup(s, h, p)
> > >
> > > However, sym now points to the address of the undefined symbol in the
> > > second library (sym->st_shndx is NULL) instead of searching for it in
> > > dependencies. It seems that do_dlsym() only checks for undefined symbol
> > > (sym->shndx==NULL) when DL_FDPIC is 1 and DL_FDPIC is 0 in my case.
> > >
> > > Does it make any sense to return an undefined symbol from dlsym()?
> > > Or does it make sense to return an undefined symbol from sysv_lookup()?
> > > Or is there any other arch specific issue that happened before, when
> > > library was loaded?
> >
> > yes, if the search involves the main executable then
> > st_shndx==0 && st_value!=0 symbols must be included
> > because it's a plt in the exe and that's how function
> > addresses work..
> > on most targets except mips.
> >
> > undef syms have st_value==0 in shared libs, maybe
> > not in mips? can you post the readelf -aW output of
> > the module that has st_shndx==0 && st_value!=0 entry
> > in its dynamic symbol table
> >
> > i think this was going to be fixed by
> > https://www.openwall.com/lists/musl/2017/02/16/1/2
> > but that was never applied.
>
> I brought it up a few times after that, asking what should be done
> since it no longer cleanly applies. The concept of that patch is
> probably still right but a localized fix now followed by deduplication
> later is probably preferable.
>
> Do you know if the TLS and STB_LOCAL issues described there still
> exist too?

i think so (there is no st_shndx check for STT_TLS
and no OK_BIND check)

> > > I created a simple patch that skips a symbol if it is undefined.
> > > https://raw.githubusercontent.com/luizluca/openwrt/b9674d528513c7c93205fa000fed7c0d3c6bb2e7/toolchain/musl/patches/020-dlsym_donot_return_address_from_undef_sym.patch
>
> This patch is wrong (on non-MIPS and on MIPS with PLT); it will result
> in wrong values for dlsym of a
>
> > i think the find_sym logic should be copied
> > because mips behaves differently from other targets:
> >
> > http://git.musl-libc.org/cgit/musl/commit/?id=2d8cc92a7cb4a3256ed07d86843388ffd8a882b1
>
> Yes. Conceptually, compared to find_sym, need_def is always false for
> dlsym (dlsym must return PLT thunk and copy relocation definitions),
> and STT_TLS was already checked as a special case above to lookup the
> thread-local copy of the object, so the only additional check needed
> here is !ARCH_SYM_REJECT_UND(sym). Does that sound correct to you?

i think the right check is

	sym->st_shndx || !ARCH_SYM_REJECT_UND(sym)

so the mips plt bit is only checked if st_shndx==0
otherwise data symbols may be mishandled.
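
a minimal sketch of that check, for illustration only. the struct and
helper names below (sketch_sym, sketch_reject_und_mips,
sketch_reject_und_default, sketch_dlsym_accepts) are hypothetical
stand-ins and not musl code, and the 0x8 value for the mips plt flag in
st_other is an assumption modelled on STO_MIPS_PLT:

```c
#include <stdint.h>

/* Assumed value of the MIPS "has PLT entry" flag in st_other
 * (modelled on STO_MIPS_PLT; treat it as an assumption). */
#define SKETCH_STO_MIPS_PLT 0x8

/* Hypothetical, reduced symbol record: only the fields the check reads. */
struct sketch_sym {
    uint32_t st_value;  /* symbol address, or PLT stub address */
    uint8_t  st_other;  /* arch-specific flags live here on MIPS */
    uint16_t st_shndx;  /* 0 means the symbol is undefined in this object */
};

/* MIPS-like policy: an undefined symbol is rejected unless its PLT flag is set. */
static int sketch_reject_und_mips(const struct sketch_sym *s)
{
    return !(s->st_other & SKETCH_STO_MIPS_PLT);
}

/* Policy assumed for other targets: undefined symbols are never rejected here. */
static int sketch_reject_und_default(const struct sketch_sym *s)
{
    (void)s;
    return 0;
}

/* The proposed test: a defined symbol is accepted outright, and the arch
 * policy is consulted only when st_shndx == 0. Returns 1 to accept, 0 to skip. */
static int sketch_dlsym_accepts(const struct sketch_sym *s,
                                int (*reject_und)(const struct sketch_sym *))
{
    return s->st_shndx || !reject_und(s);
}
```

with the mips-like policy a defined data symbol is accepted without
looking at st_other, an undefined symbol without the plt bit is skipped
(so the lookup could go on to the dependencies), and an undefined symbol
with the plt bit is still accepted. with the default policy nothing
changes for undefined symbols.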