mailing list of musl libc
* dynamic linker issue
@ 2015-07-09 15:11 Natanael Copa
  2015-07-09 18:03 ` Rich Felker
  2015-07-09 18:20 ` Felix Janda
  0 siblings, 2 replies; 5+ messages in thread
From: Natanael Copa @ 2015-07-09 15:11 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 2129 bytes --]

Hi,

I have a weird issue with libvirtd segfaulting:

BUG at file position route/tc.c:1009:rtnl_tc_register
Assertion failed: 0 (route/tc.c: rtnl_tc_register: 1009)
Aborted (core dumped)

It happens here:
https://github.com/thom311/libnl/blob/48182486341d1de7892494f272e892c0b18ebef5/lib/route/tc.c#L1008

gdb with a breakpoint showed that to_kind is set, but to_type is definitely wrong:
<ncopa> (gdb) print blackhole_ops
<ncopa> $1 = {to_kind = 0x614d43a1026e "blackhole", to_type = 1136841632, to_size = 0, 
<ncopa>   to_dump = {0x0, 0x0, 0x0}, to_msg_fill = 0x0, to_msg_fill_raw = 0x0, 
<ncopa>   to_msg_parser = 0x0, to_free_data = 0x0, to_clone = 0x0, to_list = {
<ncopa>     next = 0x0, prev = 0x0}}

.to_type is initialized here:

https://github.com/thom311/libnl/blob/48182486341d1de7892494f272e892c0b18ebef5/lib/route/qdisc/blackhole.c

So this smells like the dynamic linker is corrupting memory when gnu
hash is used.

I believe it is related to the fact that libxenlight pulls in
libnl-route-3.so.200 and libvirt.so.0 pulls in libnl.so.1. These
different versions of the libnl libraries provide various symbols with
the same name.

I was able to create a smaller testcase, which is attached. Note that
both libnl-route-3.so.200 and libnl.so.1 provide 'rtnl_addr_alloc'.

Problem happens on alpine edge with those versions:
musl-1.1.10-r2
libnl3-3.2.26-r2
libnl-1.1.4-r0
gcc-5.1.0-r0

I was not able to reproduce it with alpine v3.2 (stable) with those
versions:
libnl3-3.2.25-r0
musl-1.1.9-r3
libnl-1.1.4-r0
gcc-4.9.2-r5

On edge I tried to build the libs with clang and the problem appeared.

We have changed to using gnu hash by default recently:
http://git.alpinelinux.org/cgit/aports/commit/main/binutils?id=ecd6d7d10fc37382bbdd89138199f88429797c7f

Also, I tried building various musl versions from git (at least back to
v1.1.7) and the problem still happens. (I am only 95% sure I ran the test
properly.)

So I suspect there is a bug in the musl dynamic linker with gnu hash that
has been there for a long time.

It should be easy to reproduce with the 3 attached testfiles on alpine
edge.

I only tested x86_64.

Ideas?

Thanks!

-nc

[-- Attachment #2: Makefile --]
[-- Type: application/octet-stream, Size: 592 bytes --]

# Makefile
CFLAGS = -fPIC
LIBNL3_CFLAGS := $(shell pkg-config --cflags libnl-route-3.0)
LIBNL3_LIBS := $(shell pkg-config --libs libnl-route-3.0)

LIBNL_CFLAGS := $(shell pkg-config --cflags libnl-1)
LIBNL_LIBS := $(shell pkg-config --libs libnl-1)

run: foo
	LD_LIBRARY_PATH=$(PWD) $(HOME)/src/musl/lib/libc.so -- ./foo

foo: foo.c libfoo3.so
	$(CC) $(CFLAGS) $(LIBNL_CFLAGS) -L. -o $@ $< $(LIBNL_LIBS) -lfoo3

libfoo3.so: libfoo3.c libfoo3.h
	$(CC) $(CFLAGS) $(LIBNL3_CFLAGS) -o $@ -shared $< $(LIBNL3_LIBS)

libfoo3.h:
	echo "int foo_alloc();" >$@

clean:
	rm -f libfoo3.so foo libfoo3.h


[-- Attachment #3: foo.c --]
[-- Type: text/x-c++src, Size: 198 bytes --]

/* foo.c */
#include <netlink/route/addr.h>
#include <stdio.h>
#include "libfoo3.h"

int main(void)
{
	struct rtnl_addr *a = rtnl_addr_alloc();
	printf("foo_alloc: %i\n", foo_alloc());
	return 0;
}

[-- Attachment #4: libfoo3.c --]
[-- Type: text/x-c++src, Size: 150 bytes --]

/* libfoo3.c */
#include <netlink/route/addr.h>

int foo_alloc() {
	struct rtnl_addr *addr = NULL;
	addr = rtnl_addr_alloc();
	return addr != NULL;
}


* Re: dynamic linker issue
  2015-07-09 15:11 dynamic linker issue Natanael Copa
@ 2015-07-09 18:03 ` Rich Felker
  2015-07-09 18:20 ` Felix Janda
  1 sibling, 0 replies; 5+ messages in thread
From: Rich Felker @ 2015-07-09 18:03 UTC (permalink / raw)
  To: musl

On Thu, Jul 09, 2015 at 05:11:59PM +0200, Natanael Copa wrote:
> Hi,
> 
> I have a weird issue with libvirtd segfaulting:
> 
> BUG at file position route/tc.c:1009:rtnl_tc_register
> Assertion failed: 0 (route/tc.c: rtnl_tc_register: 1009)
> Aborted (core dumped)
> 
> It happens here:
> https://github.com/thom311/libnl/blob/48182486341d1de7892494f272e892c0b18ebef5/lib/route/tc.c#L1008
> 
> gdb with a breakpoint showed that to_kind is set, but to_type is definitely wrong:
> <ncopa> (gdb) print blackhole_ops
> <ncopa> $1 = {to_kind = 0x614d43a1026e "blackhole", to_type = 1136841632, to_size = 0, 
> <ncopa>   to_dump = {0x0, 0x0, 0x0}, to_msg_fill = 0x0, to_msg_fill_raw = 0x0, 
> <ncopa>   to_msg_parser = 0x0, to_free_data = 0x0, to_clone = 0x0, to_list = {
> <ncopa>     next = 0x0, prev = 0x0}}
> 
> .to_type is initialized here:
> 
> https://github.com/thom311/libnl/blob/48182486341d1de7892494f272e892c0b18ebef5/lib/route/qdisc/blackhole.c
> 
> So this smells like the dynamic linker is corrupting memory when gnu
> hash is used.

I don't see any evidence to believe the problem is gnu_hash, and the
fact that Alpine seems to have patched binutils to suppress generation
of the standard sysv hash tables makes it impossible to test this
hypothesis by disabling the gnu_hash code path without rebuilding the
whole system from source. :( There should not be any way that the
gnu_hash code could result in incorrect writes to memory addresses
that don't even have a relocation.

Can you please reconsider disabling the sysv hash (note: as mentioned
on IRC, this also eliminates the only place to get the size of the
symbol table without iterating over the whole gnu hash table, which
dladdr needs to do and which applications processing ELF structures
might reasonably want to do)? If you have other ideas for debugging
I'm open to trying them but I don't want to spend a lot of time trying
to track down an alleged bug in the dynamic linker when we're missing
a good way to confirm that this is even the cause of the bug.

Rich



* Re: dynamic linker issue
  2015-07-09 15:11 dynamic linker issue Natanael Copa
  2015-07-09 18:03 ` Rich Felker
@ 2015-07-09 18:20 ` Felix Janda
  2015-07-09 18:28   ` Rich Felker
  1 sibling, 1 reply; 5+ messages in thread
From: Felix Janda @ 2015-07-09 18:20 UTC (permalink / raw)
  To: musl

Can reproduce on gentoo with arm (patched Makefile):

LD_LIBRARY_PATH="/tmp/a:/usr/lib/gcc/armv7a-hardfloat-linux-musleabi/4.7.4/" /usr/lib/libc.so -- ./foo
BUG at file position route/tc.c:978:rtnl_tc_register
Assertion failed: 0 (route/tc.c: rtnl_tc_register: 978)
Makefile:10: recipe for target 'run' failed
make: *** [run] Aborted

libc.so has .hash and .gnu_hash, the others only have .hash.

Felix



* Re: dynamic linker issue
  2015-07-09 18:20 ` Felix Janda
@ 2015-07-09 18:28   ` Rich Felker
  2015-07-09 19:53     ` Natanael Copa
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2015-07-09 18:28 UTC (permalink / raw)
  To: musl

On Thu, Jul 09, 2015 at 08:20:44PM +0200, Felix Janda wrote:
> Can reproduce on gentoo with arm (patched Makefile):
> 
> LD_LIBRARY_PATH="/tmp/a:/usr/lib/gcc/armv7a-hardfloat-linux-musleabi/4.7.4/" /usr/lib/libc.so -- ./foo
> BUG at file position route/tc.c:978:rtnl_tc_register
> Assertion failed: 0 (route/tc.c: rtnl_tc_register: 978)
> Makefile:10: recipe for target 'run' failed
> make: *** [run] Aborted
> 
> libc.so has .hash and .gnu_hash, the others only have .hash.

In that case I don't think gnu_hash is relevant to the crash. It's
probably a libnl3 issue, likely just that it's fundamentally broken to
be linking multiple versions of a lib into the same program if they
use the same symbol names.

Rich



* Re: dynamic linker issue
  2015-07-09 18:28   ` Rich Felker
@ 2015-07-09 19:53     ` Natanael Copa
  0 siblings, 0 replies; 5+ messages in thread
From: Natanael Copa @ 2015-07-09 19:53 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On Thu, 9 Jul 2015 14:28:33 -0400
Rich Felker <dalias@libc.org> wrote:

> On Thu, Jul 09, 2015 at 08:20:44PM +0200, Felix Janda wrote:
> > libc.so has .hash and .gnu_hash, the others only have .hash.
> 
> In that case I don't think gnu_hash is relevant to the crash. It's
> probably a libnl3 issue, likely just that it's fundamentally broken to
> be linking multiple versions of a lib into the same program if they
> use the same symbol names.

I think you are right. Apparently it also happens on glibc:
https://bugs.archlinux.org/task/29921

The configure script for libvirt even has a specific check to avoid
that libvirt and netcf link against different libnl versions:
http://libvirt.org/git/?p=libvirt.git;a=blob;f=configure.ac;h=6533b88851efd5b1842d2beaaefcc254e6fce33d;hb=HEAD#l2616

They don't have the same check for the xen libraries so it must have
been introduced by me who recently enabled libnl3 for xen:
http://git.alpinelinux.org/cgit/aports/commit/main/xen?id=4bf510506bcf0d81e02252991ba61e3837cec3dd

Thanks for testing and helping troubleshoot.

-nc


