From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: dynamic linker issue
Date: Thu, 9 Jul 2015 14:03:07 -0400
Message-ID: <20150709180307.GT1173@brightrain.aerifal.cx>
References: <20150709171159.4f08479e@ncopa-desktop.alpinelinux.org>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Thu, Jul 09, 2015 at 05:11:59PM +0200, Natanael Copa wrote:
> Hi,
>
> I have a weird issue with libvirtd segfaulting:
>
> BUG at file position route/tc.c:1009:rtnl_tc_register
> Assertion failed: 0 (route/tc.c: rtnl_tc_register: 1009)
> Aborted (core dumped)
>
> It happens here:
> https://github.com/thom311/libnl/blob/48182486341d1de7892494f272e892c0b18ebef5/lib/route/tc.c#L1008
>
> gdb with a breakpoint showed that to_kind is set, but to_type is definitively wrong:
> (gdb) print blackhole_ops
> $1 = {to_kind = 0x614d43a1026e "blackhole", to_type = 1136841632, to_size = 0,
>   to_dump = {0x0, 0x0, 0x0}, to_msg_fill = 0x0, to_msg_fill_raw = 0x0,
>   to_msg_parser = 0x0, to_free_data = 0x0, to_clone = 0x0, to_list = {
>     next = 0x0, prev = 0x0}}
>
> ..to_type is initialized here:
>
> https://github.com/thom311/libnl/blob/48182486341d1de7892494f272e892c0b18ebef5/lib/route/qdisc/blackhole.c
>
> So this smells like the dynamic linker is corrupting memory when gnu
> hash is used.

I don't see any evidence to believe the problem is gnu_hash, and the
fact that Alpine seems to have patched binutils to suppress generation
of the standard sysv hash tables makes it impossible to test this
hypothesis by disabling the gnu_hash code path without rebuilding the
whole system from source. :(

There should not be any way that the gnu_hash code could result in
incorrect writes to memory addresses that don't even have a
relocation.

Can you please reconsider disabling the sysv hash (note: as mentioned
on IRC, this also eliminates the only place to get the size of the
symbol table without iterating over the whole gnu hash table, which
dladdr needs to do and which applications processing ELF structures
might reasonably want to do)?

If you have other ideas for debugging I'm open to trying them but I
don't want to spend a lot of time trying to track down an alleged bug
in the dynamic linker when we're missing a good way to confirm that
this is even the cause of the bug.

Rich
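
For readers unfamiliar with the point about the symbol table size, here is a rough sketch of the difference between the two hash formats. It is not musl's actual code. The function names `count_syms_sysv` and `count_syms_gnu` and the `bloom_word_size` parameter are hypothetical, chosen for illustration. The table layouts are assumed to follow the usual DT_HASH and DT_GNU_HASH formats.

```c
#include <stddef.h>
#include <stdint.h>

/* SysV hash (DT_HASH): word 0 is nbucket, word 1 is nchain.
 * nchain equals the number of symbol table entries, so the
 * symbol count is a single read. */
static uint32_t count_syms_sysv(const uint32_t *hashtab)
{
	return hashtab[1];
}

/* GNU hash (DT_GNU_HASH): header words are nbuckets, symoffset,
 * bloom_size, bloom_shift, followed by the bloom filter words,
 * the buckets, and one chain value per hashed symbol.
 * bloom_word_size is 8 for ELF64 and 4 for ELF32; it is passed in
 * explicitly (an assumption of this sketch) so the code does not
 * depend on the host word size. */
static uint32_t count_syms_gnu(const uint32_t *ghashtab, size_t bloom_word_size)
{
	uint32_t nbuckets = ghashtab[0];
	uint32_t symoffset = ghashtab[1];
	uint32_t bloom_size = ghashtab[2];
	const uint32_t *buckets = ghashtab + 4 + bloom_size * (bloom_word_size / 4);
	const uint32_t *chain = buckets + nbuckets;
	uint32_t last = 0;

	/* No count is stored, so find the highest symbol index that
	 * starts a chain. */
	for (uint32_t i = 0; i < nbuckets; i++)
		if (buckets[i] > last)
			last = buckets[i];

	/* All buckets empty: only the unhashed symbols below symoffset
	 * are known to exist. */
	if (last < symoffset)
		return symoffset;

	/* Walk that chain to its end; the final entry has its low bit set. */
	while (!(chain[last - symoffset] & 1))
		last++;

	return last + 1;
}
```

With only the GNU table present, the count costs a pass over every bucket plus a walk of the last chain, whereas the sysv table gives it directly in `nchain`.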