mailing list of musl libc
* dlsym returning unresolved symbol address instead of dependency library symbol address
@ 2019-08-10  8:16 Luiz Angelo Daros de Luca
  2019-08-10 10:11 ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Luiz Angelo Daros de Luca @ 2019-08-10  8:16 UTC (permalink / raw)
  To: musl

Hello,

I'm the ruby maintainer in OpenWrt 18.06 (musl 1.1.19). I got a bug report (
https://github.com/openwrt/packages/issues/9297) related to musl on 32-bit
mipsel.

When ruby loads a module (.so), it checks whether that module was built for
the same ruby that is loading it. Ruby loads libruby at startup, which exports
the ruby_xmalloc symbol. So the check consists of loading the module, looking
up ruby_xmalloc in the module context, and comparing the result with the
global ruby_xmalloc address. If they do not match, the module is using a
different libruby. Something like this:

void *handle = dlopen(file, RTLD_LAZY|RTLD_GLOBAL);
void *ex = dlsym(handle, EXTERNAL_PREFIX"ruby_xmalloc");
if (ex && ex != (void *)ruby_xmalloc) {
   // module is incompatible!
}

The first time a module is loaded, it simply works as expected.
I debugged it and musl behaves correctly. In do_dlsym(struct dso *p, const
char *s, void *ra), it correctly fails to find the symbol with:

sym = sysv_lookup(s, h, p)

and correctly finds it with:

sysv_lookup(s, h, p->deps[0])

Now, when the second module is loaded, it finds "ruby_xmalloc" right away with:

sym = sysv_lookup(s, h, p)

However, sym now points to the undefined symbol entry in the second
library (sym->st_shndx is 0), and its address is returned instead of the
search continuing in the dependencies. It seems that do_dlsym() only checks
for an undefined symbol (sym->st_shndx == 0) when DL_FDPIC is 1, and
DL_FDPIC is 0 in my case.

Does it make any sense to return an undefined symbol from dlsym()?
Or does it make sense to return an undefined symbol from sysv_lookup()?
Or did some other arch-specific issue occur earlier, when the
library was loaded?
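
The lookup order described above can be modelled in a few lines. This is
only a sketch under stated assumptions, not musl's code: toy_sym, toy_dso,
toy_lookup and toy_dlsym are hypothetical names, the real sysv_lookup uses
the ELF hash table rather than a linear scan, and the addresses are
illustrative. It shows how a name-only match in the module can hide the
definition that lives in its dependency.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for a dynamic symbol and a loaded
 * library; musl's real structures carry far more state. */
struct toy_sym {
    const char *name;
    unsigned shndx;        /* 0 means the symbol is undefined here */
    unsigned long value;   /* st_value as stored in the symbol table */
};

struct toy_dso {
    const struct toy_sym *syms;
    size_t nsyms;
    const struct toy_dso *const *deps;   /* NULL-terminated, may be NULL */
};

/* Name-only lookup: returns the first entry called s, defined or not. */
const struct toy_sym *toy_lookup(const char *s, const struct toy_dso *d)
{
    for (size_t i = 0; i < d->nsyms; i++)
        if (strcmp(d->syms[i].name, s) == 0)
            return &d->syms[i];
    return NULL;
}

/* Search the module first, then its dependencies, accepting any match. */
unsigned long toy_dlsym(const struct toy_dso *p, const char *s)
{
    const struct toy_sym *sym = toy_lookup(s, p);
    for (size_t i = 0; !sym && p->deps && p->deps[i]; i++)
        sym = toy_lookup(s, p->deps[i]);
    return sym ? sym->value : 0;
}

/* Illustrative data: libruby defines the symbol. */
const struct toy_sym libruby_syms[] = { { "ruby_xmalloc", 11, 0x12340 } };
const struct toy_dso libruby = { libruby_syms, 1, NULL };
const struct toy_dso *const module_deps[] = { &libruby, NULL };

/* A module with no entry of its own falls through to libruby. */
const struct toy_dso first_module = { NULL, 0, module_deps };

/* A module whose undefined entry carries a nonzero value wins the lookup. */
const struct toy_sym second_syms[] = { { "ruby_xmalloc", 0, 0x4930 } };
const struct toy_dso second_module = { second_syms, 1, module_deps };
```

With this data, toy_dlsym(&first_module, "ruby_xmalloc") yields libruby's
address, while toy_dlsym(&second_module, "ruby_xmalloc") yields the value
stored in the module's own undefined entry.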

I created a simple patch that skips a symbol if it is undefined.
https://raw.githubusercontent.com/luizluca/openwrt/b9674d528513c7c93205fa000fed7c0d3c6bb2e7/toolchain/musl/patches/020-dlsym_donot_return_address_from_undef_sym.patch

It fixes the issue and it did not break my system (it still boots). However,
I have not tested it on multiple archs, nor have I run an extensive test.

I'm not subscribed. Please, CC me.

Regards,
---
     Luiz Angelo Daros de Luca
            luizluca@gmail.com


* Re: dlsym returning unresolved symbol address instead of dependency library symbol address
  2019-08-10  8:16 dlsym returning unresolved symbol address instead of dependency library symbol address Luiz Angelo Daros de Luca
@ 2019-08-10 10:11 ` Szabolcs Nagy
  2019-08-10 12:35   ` Szabolcs Nagy
  2019-08-10 16:42   ` Rich Felker
  0 siblings, 2 replies; 7+ messages in thread
From: Szabolcs Nagy @ 2019-08-10 10:11 UTC (permalink / raw)
  To: musl; +Cc: Luiz Angelo Daros de Luca

* Luiz Angelo Daros de Luca <luizluca@gmail.com> [2019-08-10 05:16:19 -0300]:
> I'm ruby maintainer in OpenWrt 18.06 (musl 1.1.19). I got a bug report (
> https://github.com/openwrt/packages/issues/9297) related to musl in mipsel
> 32bit.
> 
> When ruby loads a module (.so), it checks if that module was built for the
> same ruby that is loading it. Ruby loads libruby at startup, which exports
> ruby_xmalloc sym. So, the check consists on loading the module, searching
> for ruby_xmalloc in the module context and comparing with global
> ruby_xmalloc address. If they do not match, the module is using a different
> libruby. Something like this:
> 
> handle = (void*)dlopen(file, RTLD_LAZY|RTLD_GLOBAL)
> void *ex = dlsym(handle, EXTERNAL_PREFIX"ruby_xmalloc");
> if (ex && ex != ruby_xmalloc) {
>    // module is incompatible!
> }
> 
> The first time a module is loaded, it simply works as expected.
> I debugged and musl is working nicely. At do_dlsym(struct dso *p, const
> char *s, void *ra), it correctly fails to find the symbol with:
> 
> sym = sysv_lookup(s, h, p)
> 
> and correctly find it with:
> 
> sysv_lookup(s, h, p->deps[0])
> 
> Now, when the second module is loaded, it find "ruby_xmalloc" already with:
> 
> sym = sysv_lookup(s, h, p)
> 
> However, sym now points to the address of the undefined symbol in the
> second library (sym->st_shndx is NULL) instead of searching for it in
> dependencies. It seems that do_dlsym() only checks for undefined symbol
> (sym->shndx==NULL) when DL_FDPIC is 1 and DL_FDPIC is 0 in my case.
> 
> Does it make any sense to return an undefined symbol from dlsym()?
> Or does it make sense to return an undefined symbol from sysv_lookup()?
> Or is there any other arch specific issue that happened before, when
> library was loaded?

yes, if the search involves the main executable then
st_shndx==0 && st_value!=0 symbols must be included,
because such a symbol is a plt entry in the exe and that is
how function addresses work... on most targets except mips.

undef syms have st_value==0 in shared libs, maybe
not on mips? can you post the readelf -aW output of
the module that has a st_shndx==0 && st_value!=0 entry
in its dynamic symbol table?

i think this was going to be fixed by
https://www.openwall.com/lists/musl/2017/02/16/1/2
but that was never applied.

> 
> I created a simple patch that skips a symbol if it is undefined.
> https://raw.githubusercontent.com/luizluca/openwrt/b9674d528513c7c93205fa000fed7c0d3c6bb2e7/toolchain/musl/patches/020-dlsym_donot_return_address_from_undef_sym.patch
> 

i think the find_sym logic should be copied
because mips behaves differently from other targets:

http://git.musl-libc.org/cgit/musl/commit/?id=2d8cc92a7cb4a3256ed07d86843388ffd8a882b1





* Re: dlsym returning unresolved symbol address instead of dependency library symbol address
  2019-08-10 10:11 ` Szabolcs Nagy
@ 2019-08-10 12:35   ` Szabolcs Nagy
  2019-08-10 16:42   ` Rich Felker
  1 sibling, 0 replies; 7+ messages in thread
From: Szabolcs Nagy @ 2019-08-10 12:35 UTC (permalink / raw)
  To: musl, Luiz Angelo Daros de Luca

* Szabolcs Nagy <nsz@port70.net> [2019-08-10 12:11:11 +0200]:
> * Luiz Angelo Daros de Luca <luizluca@gmail.com> [2019-08-10 05:16:19 -0300]:
> > I'm ruby maintainer in OpenWrt 18.06 (musl 1.1.19). I got a bug report (
> > https://github.com/openwrt/packages/issues/9297) related to musl in mipsel
...
> yes, if the search involves the main executable then
> st_shndx==0 && st_value!=0 symbols must be included
> because it's a plt in the exe and that's how function
> addresses work.. on most targets except mips.
> 
> undef syms have st_value==0 in shared libs, maybe
> not in mips? can you post the readelf -aW output of
> the module that has st_shndx==0 && st_value!=0 entry
> in its dynamic symbol table

ah i see in the bugreport

Buildx86$ staging_dir/toolchain-mips_24kc_gcc-7.3.0_musl/bin/mips-openwrt-linux-musl-readelf -s staging_dir/target-mips_24kc_musl/root-ar71xx/usr/lib/ruby/2.5/mips-linux-gnu/stringio.so | grep mall
91: 00004930 0 FUNC GLOBAL DEFAULT UND ruby_xmalloc
187: 00004930 0 FUNC GLOBAL DEFAULT UND ruby_xmalloc
    ^^^^^^^^^
st_value!=0

that is mips specific strangeness.
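
a rough sketch of how such an entry could be classified, assuming the
usual mips convention that a nonzero st_value on an undefined function
symbol is a lazy-binding stub address unless STO_MIPS_PLT is set in
st_other. toy_classify and the TOY_ names are hypothetical; 0x8 is the
numeric value of STO_MIPS_PLT in <elf.h>.

```c
#include <stdint.h>

#define TOY_STO_MIPS_PLT 0x8   /* numeric value of STO_MIPS_PLT in <elf.h> */

enum toy_sym_kind {
    TOY_DEFINED,        /* st_shndx != 0: the object defines the symbol */
    TOY_UND_REFERENCE,  /* undefined, st_value == 0: a plain reference */
    TOY_UND_PLT_ADDR,   /* undefined, st_value is a PLT entry that serves
                           as the function's address */
    TOY_UND_MIPS_STUB   /* undefined, st_value is a mips lazy-binding stub
                           and must not be returned as an address */
};

/* Classify one dynamic symbol table entry from its raw fields. */
enum toy_sym_kind toy_classify(uint16_t st_shndx, uint32_t st_value,
                               uint8_t st_other, int is_mips)
{
    if (st_shndx != 0)
        return TOY_DEFINED;
    if (st_value == 0)
        return TOY_UND_REFERENCE;
    if (is_mips && !(st_other & TOY_STO_MIPS_PLT))
        return TOY_UND_MIPS_STUB;
    return TOY_UND_PLT_ADDR;
}
```

for the entry above (UND, value 0x4930, and assuming no plt flag in
st_other) this gives TOY_UND_MIPS_STUB on mips, whereas the same fields
on another target would be read as a plt address.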

it's still not clear to me why there is a different
code path between the first and second dlopen, but the
right fix to the reported issue is to reuse the
find_sym logic (since the executable case need not
be handled, a somewhat simpler logic may work too,
but i'd prefer a single logic for dlsym and reloc
processing, exactly because of broken elf targets
like mips that make maintenance of such code harder)




* Re: dlsym returning unresolved symbol address instead of dependency library symbol address
  2019-08-10 10:11 ` Szabolcs Nagy
  2019-08-10 12:35   ` Szabolcs Nagy
@ 2019-08-10 16:42   ` Rich Felker
  2019-08-10 17:27     ` Szabolcs Nagy
  1 sibling, 1 reply; 7+ messages in thread
From: Rich Felker @ 2019-08-10 16:42 UTC (permalink / raw)
  To: musl; +Cc: Luiz Angelo Daros de Luca

On Sat, Aug 10, 2019 at 12:11:11PM +0200, Szabolcs Nagy wrote:
> * Luiz Angelo Daros de Luca <luizluca@gmail.com> [2019-08-10 05:16:19 -0300]:
> > I'm ruby maintainer in OpenWrt 18.06 (musl 1.1.19). I got a bug report (
> > https://github.com/openwrt/packages/issues/9297) related to musl in mipsel
> > 32bit.
> > 
> > When ruby loads a module (.so), it checks if that module was built for the
> > same ruby that is loading it. Ruby loads libruby at startup, which exports
> > ruby_xmalloc sym. So, the check consists on loading the module, searching
> > for ruby_xmalloc in the module context and comparing with global
> > ruby_xmalloc address. If they do not match, the module is using a different
> > libruby. Something like this:
> > 
> > handle = (void*)dlopen(file, RTLD_LAZY|RTLD_GLOBAL)
> > void *ex = dlsym(handle, EXTERNAL_PREFIX"ruby_xmalloc");
> > if (ex && ex != ruby_xmalloc) {
> >    // module is incompatible!
> > }
> > 
> > The first time a module is loaded, it simply works as expected.
> > I debugged and musl is working nicely. At do_dlsym(struct dso *p, const
> > char *s, void *ra), it correctly fails to find the symbol with:
> > 
> > sym = sysv_lookup(s, h, p)
> > 
> > and correctly find it with:
> > 
> > sysv_lookup(s, h, p->deps[0])
> > 
> > Now, when the second module is loaded, it find "ruby_xmalloc" already with:
> > 
> > sym = sysv_lookup(s, h, p)
> > 
> > However, sym now points to the address of the undefined symbol in the
> > second library (sym->st_shndx is NULL) instead of searching for it in
> > dependencies. It seems that do_dlsym() only checks for undefined symbol
> > (sym->shndx==NULL) when DL_FDPIC is 1 and DL_FDPIC is 0 in my case.
> > 
> > Does it make any sense to return an undefined symbol from dlsym()?
> > Or does it make sense to return an undefined symbol from sysv_lookup()?
> > Or is there any other arch specific issue that happened before, when
> > library was loaded?
> 
> yes, if the search involves the main executable then
> st_shndx==0 && st_value!=0 symbols must be included
> because it's a plt in the exe and that's how function
> addresses work.. on most targets except mips.
> 
> undef syms have st_value==0 in shared libs, maybe
> not in mips? can you post the readelf -aW output of
> the module that has st_shndx==0 && st_value!=0 entry
> in its dynamic symbol table
> 
> i think this was going to be fixed by
> https://www.openwall.com/lists/musl/2017/02/16/1/2
> but that was never applied.

I brought it up a few times after that, asking what should be done
since it no longer cleanly applies. The concept of that patch is
probably still right but a localized fix now followed by deduplication
later is probably preferable.

Do you know if the TLS and STB_LOCAL issues described there still
exist too?

> > I created a simple patch that skips a symbol if it is undefined.
> > https://raw.githubusercontent.com/luizluca/openwrt/b9674d528513c7c93205fa000fed7c0d3c6bb2e7/toolchain/musl/patches/020-dlsym_donot_return_address_from_undef_sym.patch

This patch is wrong (on non-MIPS and on MIPS with PLT); it will result
in wrong values for dlsym of a

> i think the find_sym logic should be copied
> because mips behaves differently from other targets:
> 
> http://git.musl-libc.org/cgit/musl/commit/?id=2d8cc92a7cb4a3256ed07d86843388ffd8a882b1

Yes. Conceptually, compared to find_sym, need_def is always false for
dlsym (dlsym must return PLT thunk and copy relocation definitions),
and STT_TLS was already checked as a special case above to look up the
thread-local copy of the object, so the only additional check needed
here is !ARCH_SYM_REJECT_UND(sym). Does that sound correct to you?
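
A minimal sketch of the kind of filter being compared here, modelled
loosely on the checks described for find_sym and not copied from musl:
toy_sym, toy_reject_und and toy_find_sym_accepts are hypothetical names,
the reject hook shown is the MIPS flavour (other targets would never
reject), and the symbol type and binding masks are left out.

```c
#include <stdint.h>

#define TOY_STT_TLS      6     /* numeric value of STT_TLS in <elf.h> */
#define TOY_STO_MIPS_PLT 0x8   /* numeric value of STO_MIPS_PLT in <elf.h> */

/* Field order follows a 32-bit ELF symbol, minus st_name and st_size. */
struct toy_sym {
    uint32_t st_value;
    uint8_t  st_info;    /* binding in the high nibble, type in the low */
    uint8_t  st_other;
    uint16_t st_shndx;   /* 0 means undefined */
};

/* MIPS flavour of the per-arch hook: an undefined symbol is only usable
 * when its value is flagged as a PLT entry. */
static int toy_reject_und(const struct toy_sym *s)
{
    return !(s->st_other & TOY_STO_MIPS_PLT);
}

/* Returns 1 if the entry may satisfy a lookup, 0 if it must be skipped.
 * need_def is set by callers that require a real definition. */
int toy_find_sym_accepts(const struct toy_sym *s, int need_def)
{
    int type = s->st_info & 0xf;

    if (!s->st_shndx)
        if (need_def || type == TOY_STT_TLS || toy_reject_und(s))
            return 0;
    if (!s->st_value && type != TOY_STT_TLS)
        return 0;
    return 1;
}
```

With need_def fixed at 0, as proposed for dlsym, an undefined entry
survives only if the arch hook lets it through.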

Rich



* Re: dlsym returning unresolved symbol address instead of dependency library symbol address
  2019-08-10 16:42   ` Rich Felker
@ 2019-08-10 17:27     ` Szabolcs Nagy
  2019-08-13  7:10       ` Luiz Angelo Daros de Luca
  0 siblings, 1 reply; 7+ messages in thread
From: Szabolcs Nagy @ 2019-08-10 17:27 UTC (permalink / raw)
  To: musl; +Cc: Luiz Angelo Daros de Luca

* Rich Felker <dalias@libc.org> [2019-08-10 12:42:52 -0400]:
> On Sat, Aug 10, 2019 at 12:11:11PM +0200, Szabolcs Nagy wrote:
> > * Luiz Angelo Daros de Luca <luizluca@gmail.com> [2019-08-10 05:16:19 -0300]:
> > > I'm ruby maintainer in OpenWrt 18.06 (musl 1.1.19). I got a bug report (
> > > https://github.com/openwrt/packages/issues/9297) related to musl in mipsel
> > > 32bit.
> > > 
> > > When ruby loads a module (.so), it checks if that module was built for the
> > > same ruby that is loading it. Ruby loads libruby at startup, which exports
> > > ruby_xmalloc sym. So, the check consists on loading the module, searching
> > > for ruby_xmalloc in the module context and comparing with global
> > > ruby_xmalloc address. If they do not match, the module is using a different
> > > libruby. Something like this:
> > > 
> > > handle = (void*)dlopen(file, RTLD_LAZY|RTLD_GLOBAL)
> > > void *ex = dlsym(handle, EXTERNAL_PREFIX"ruby_xmalloc");
> > > if (ex && ex != ruby_xmalloc) {
> > >    // module is incompatible!
> > > }
> > > 
> > > The first time a module is loaded, it simply works as expected.
> > > I debugged and musl is working nicely. At do_dlsym(struct dso *p, const
> > > char *s, void *ra), it correctly fails to find the symbol with:
> > > 
> > > sym = sysv_lookup(s, h, p)
> > > 
> > > and correctly find it with:
> > > 
> > > sysv_lookup(s, h, p->deps[0])
> > > 
> > > Now, when the second module is loaded, it find "ruby_xmalloc" already with:
> > > 
> > > sym = sysv_lookup(s, h, p)
> > > 
> > > However, sym now points to the address of the undefined symbol in the
> > > second library (sym->st_shndx is NULL) instead of searching for it in
> > > dependencies. It seems that do_dlsym() only checks for undefined symbol
> > > (sym->shndx==NULL) when DL_FDPIC is 1 and DL_FDPIC is 0 in my case.
> > > 
> > > Does it make any sense to return an undefined symbol from dlsym()?
> > > Or does it make sense to return an undefined symbol from sysv_lookup()?
> > > Or is there any other arch specific issue that happened before, when
> > > library was loaded?
> > 
> > yes, if the search involves the main executable then
> > st_shndx==0 && st_value!=0 symbols must be included
> > because it's a plt in the exe and that's how function
> > addresses work.. on most targets except mips.
> > 
> > undef syms have st_value==0 in shared libs, maybe
> > not in mips? can you post the readelf -aW output of
> > the module that has st_shndx==0 && st_value!=0 entry
> > in its dynamic symbol table
> > 
> > i think this was going to be fixed by
> > https://www.openwall.com/lists/musl/2017/02/16/1/2
> > but that was never applied.
> 
> I brought it up a few times after that, asking what should be done
> since it no longer cleanly applies. The concept of that patch is
> probably still right but a localized fix now followed by deduplication
> later is probably preferable.
> 
> Do you know if the TLS and STB_LOCAL issues described there still
> exist too?

i think so

(there is no st_shndx check for STT_TLS and no OK_BIND check)

> 
> > > I created a simple patch that skips a symbol if it is undefined.
> > > https://raw.githubusercontent.com/luizluca/openwrt/b9674d528513c7c93205fa000fed7c0d3c6bb2e7/toolchain/musl/patches/020-dlsym_donot_return_address_from_undef_sym.patch
> 
> This patch is wrong (on non-MIPS and on MIPS with PLT); it will result
> in wrong values for dlsym of a
> 
> > i think the find_sym logic should be copied
> > because mips behaves differently from other targets:
> > 
> > http://git.musl-libc.org/cgit/musl/commit/?id=2d8cc92a7cb4a3256ed07d86843388ffd8a882b1
> 
> Yes. Conceptually, compared to find_sym, need_def is always false for
> dlsym (dlsym must return PLT thunk and copy relocation definitions),
> and STT_TLS was already checked as a special case above to lookup the
> thread-local copy of the object, so the only additional check needed
> here is !ARCH_SYM_REJECT_UND(sym). Does that sound correct to you?

i think the right check is

 sym->st_shndx || !ARCH_SYM_REJECT_UND(sym)

so the mips plt bit is only checked if st_shndx==0,
otherwise data symbols may be mishandled.
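
a small sketch of the difference between the two candidate checks,
with hypothetical stand-ins: toy_sym is a cut-down symbol record and
TOY_ARCH_SYM_REJECT_UND mimics the mips definition of the macro (reject
unless the plt bit is set); neither is musl's actual code.

```c
#include <stdint.h>

#define TOY_STO_MIPS_PLT 0x8   /* numeric value of STO_MIPS_PLT in <elf.h> */

struct toy_sym {
    uint32_t st_value;
    uint8_t  st_other;
    uint16_t st_shndx;   /* 0 means undefined */
};

/* mips-style hook: true when the symbol lacks the plt bit */
#define TOY_ARCH_SYM_REJECT_UND(s) (!((s)->st_other & TOY_STO_MIPS_PLT))

/* first proposal: look at the plt bit for every symbol */
int toy_check_und_only(const struct toy_sym *s)
{
    return !TOY_ARCH_SYM_REJECT_UND(s);
}

/* second proposal: look at the plt bit only for undefined symbols */
int toy_check_combined(const struct toy_sym *s)
{
    return s->st_shndx || !TOY_ARCH_SYM_REJECT_UND(s);
}
```

a defined data symbol (st_shndx!=0, no plt bit) is refused by the first
check and accepted by the second; an undefined stub entry is refused by
both, and an undefined entry with the plt bit is accepted by both.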




* Re: dlsym returning unresolved symbol address instead of dependency library symbol address
  2019-08-10 17:27     ` Szabolcs Nagy
@ 2019-08-13  7:10       ` Luiz Angelo Daros de Luca
  2019-08-13  8:50         ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Luiz Angelo Daros de Luca @ 2019-08-13  7:10 UTC (permalink / raw)
  To: musl, Luiz Angelo Daros de Luca

Hello,

Thank you all. That was the first time I touched libc. I'm still not sure
what the best way to deal with my issue is. Should I backport
https://www.openwall.com/lists/musl/2019/08/10/8/1 (it seems to be
self-contained) or simply use an updated version (I hope it is OK now) of
my previous patch
https://raw.githubusercontent.com/luizluca/openwrt/20eba2ebde5b63b1259bf318de76bcd033c07196/toolchain/musl/patches/020-mips-dlsym_donot_return_address_from_undef_sym.patch
?

Both seem to work (at least for MIPS).

Regards,

---
     Luiz Angelo Daros de Luca
            luizluca@gmail.com


* Re: dlsym returning unresolved symbol address instead of dependency library symbol address
  2019-08-13  7:10       ` Luiz Angelo Daros de Luca
@ 2019-08-13  8:50         ` Szabolcs Nagy
  0 siblings, 0 replies; 7+ messages in thread
From: Szabolcs Nagy @ 2019-08-13  8:50 UTC (permalink / raw)
  To: Luiz Angelo Daros de Luca; +Cc: musl

* Luiz Angelo Daros de Luca <luizluca@gmail.com> [2019-08-13 04:10:56 -0300]:
> 
> Thank you all. That was the first time I touched libc. I'm still not sure
> what is the best way to deal with my issue.
> What would be best way to do it? Should I backport
> https://www.openwall.com/lists/musl/2019/08/10/8/1 (it seems to be
> self-contained) or simply use an updated version (I hope it is OK now) of
> my previous patch
> https://raw.githubusercontent.com/luizluca/openwrt/20eba2ebde5b63b1259bf318de76bcd033c07196/toolchain/musl/patches/020-mips-dlsym_donot_return_address_from_undef_sym.patch
> ?

already done

https://www.openwall.com/lists/musl/2019/08/10/8




