From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/6365 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: Re: gthread_once Date: Sat, 18 Oct 2014 15:02:31 -0400 Message-ID: <20141018190231.GG32028@brightrain.aerifal.cx> References: <20141018101813.GB16659@port70.net> <20141018154518.GE32028@brightrain.aerifal.cx> <20141018181430.GC16659@port70.net> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1413658970 11547 80.91.229.3 (18 Oct 2014 19:02:50 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sat, 18 Oct 2014 19:02:50 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-6378-gllmg-musl=m.gmane.org@lists.openwall.com Sat Oct 18 21:02:45 2014 Return-path: Envelope-to: gllmg-musl@plane.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1XfZH2-0006zr-Hx for gllmg-musl@plane.gmane.org; Sat, 18 Oct 2014 21:02:44 +0200 Original-Received: (qmail 26418 invoked by uid 550); 18 Oct 2014 19:02:43 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 26410 invoked from network); 18 Oct 2014 19:02:43 -0000 Content-Disposition: inline In-Reply-To: <20141018181430.GC16659@port70.net> User-Agent: Mutt/1.5.21 (2010-09-15) Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:6365 Archived-At: On Sat, Oct 18, 2014 at 08:14:30PM +0200, Szabolcs Nagy wrote: > * Rich Felker [2014-10-18 11:45:18 -0400]: > > > > If I understand the explanation of what's going on correctly, it's a > > huge bug in libstdc++, and exactly the same as the one that's known > > (and that we've discussed on IRC, etc.) many times in libxml2: > > > > https://bugzilla.gnome.org/show_bug.cgi?id=704904 > > > > It looks to me like gcc is expecting dynamic-linking-like and/or > > bloatware-libpthread-like behavior, where if any one of the pthread > > symbols has a definition, they all do. This is definitely NOT the case > > with static linking, especially with musl. It's also not the case with > > glibc, though they have more indirect dependencies that cause a large > > portion (but not all) of libpthread to be pulled in when you use any > > one part, and some distros (at least Redhat ones, IIRC) link their > > libpthread.a with a hack to merge all the modules into one giant .o > > file inside the .a precisely to work around bugs like this. > > > > If I'm correct, just removing all the weak attributes from the > > gthr-posix.h stuff should fix the bug. > > yes this is the bug > > it turns out gthr-posix.h decides if there are threads > if pthread_cancel is present (weak ref is non-null) > (on glibc it checks pthread_key_create and on bionic it > checks pthread_create) > > if there is no pthread_cancel then libstdc++ will assume > no threads and __gthread calls fail (eg iostream locks > are skipped, in call_once a system_error is thrown) > > if there is pthread_cancel then a __gthread_something call > will call pthread_something and if that is otherwise > unreferenced it will crash (in case of static linking) There seem to be two flaws involved here: First is the use of weak references to determine if a program is "potentially multithreaded" in order to optimize out synchronization when it's not. This is why nsz had to work to find a trick to reproduce the bug. There is no reason to think that pthread_key_create or pthread_cancel is necessarily linked in a multithreaded program, or that they won't be linked in non-multithreaded programs. Testing for pthread_create is actually close to valid -- if they don't care about the case where pthread_create is not present in the main program but loaded dynamically via dlopen. (glibc explicitly does not support such usage, and it can't happen in musl since musl does not support dlopen in static programs and always has pthread_create present in dynamic-linked ones.) To make it more correct, the condition checked should be pthread_create||thrd_create, since a program could become multithreaded via C11 threads rather than POSIX threads. The second issue is the use of weak references to functions which are actually needed. This is identical to the libxml2 bug I cited before: https://bugzilla.gnome.org/show_bug.cgi?id=704904 Using a weak reference is only valid for purposes like "if anything else in the program uses function FOO, we need to use function FOO here too". It's not valid for the seemingly similar purpose of "if anything else in the program uses the library containing FOO, we need to use FOO here". As an implementation detail, this latter pattern happens to work for current implementations of dynamic linking, but it's completely incorrect for static linking and could theoretically be wrong in the future for dynamic linking too. > example that crases here whit -static: > > #include > #include > int (*f)(pthread_t) = pthread_cancel; > int main() > { > std::cout << ""; > } > > possible workaround: create a .o that references all > pthread_* functions and link it into your app > (this is the bloatware-libpthread solution until libstdc++ is fixed) Can't it be fixed just by changing gthr-posix.h not to use weak references ever, and rebuilding gcc's target libs (or all of gcc)? Rich