From: Max Neunhoeffer
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Bug report, concurrency issue on exception with gcc 8.3.0
Date: Wed, 18 Sep 2019 14:45:51 +0200
Message-ID: <20190918124551.lsibbaouordfrddv@zen.arangodb.com>
In-Reply-To: <20190918092149.GT22009@port70.net>
References: <20190917134422.aootviums4hdtell@zen.arangodb.com>
 <20190917140227.GW9017@brightrain.aerifal.cx>
 <20190917143510.GX9017@brightrain.aerifal.cx>
 <20190918071931.lkuf45ltcrdrdxjy@zen.arangodb.com>
 <20190918092149.GT22009@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

Hello,

thank you very much for the explanation. This gives me a temporary way
to fix up our application until the bug has been fixed.

Cheers,
  Max.

On 19/09/18 11:21, Szabolcs Nagy wrote:
> * Max Neunhoeffer [2019-09-18 09:19:31 +0200]:
> > thanks for the quick response and for lobbying with the gcc folks!
> >
> > Did you see the second example program in the original bug report?
> > This seems to indicate that there might be an additional problem, since
> > when I explicitly use `pthread_cancel` (thereby circumventing the
> > detection problem), I get a crash when the first exception is thrown.
>
> pthread_cancel does not solve the detection problem.
>
> reference to pthread_cancel only helps with dynamic linking.
> in case of static linking you have to explicitly add (strong)
> reference to symbols that libgcc_eh.a uses:
>
>   pthread_cancel
>   pthread_getspecific
>   pthread_key_create
>   pthread_mutex_lock
>   pthread_mutex_unlock
>   pthread_once
>   pthread_setspecific
>
> where pthread_cancel is only needed to make libgcc_eh.a call the
> thread functions (but those are all weakrefs so will just be 0
> at runtime unless there are other strong references to them).
>
> >
> > Do you think this is a libgcc problem, too? Should I report this to the
> > gcc bug tracker as well?
> >
> > Cheers,
> >   Max.
> >
> > On 19/09/17 10:35, Rich Felker wrote:
> > > On Tue, Sep 17, 2019 at 10:02:27AM -0400, Rich Felker wrote:
> > > > On Tue, Sep 17, 2019 at 03:44:22PM +0200, Max Neunhoeffer wrote:
> > > > > Hello,
> > > > >
> > > > > I am experiencing problems when linking a large multithreaded C++ application
> > > > > statically against libmusl. I am using Alpine Linux 3.10.1 and gcc 8.3.0
> > > > > on X86_64. That is, I am using libmusl 1.1.22-r3 (Alpine Linux versioning)
> > > > > and gcc 8.3.0-r0.
> > > > >
> > > > > Before going into details, here is an overview:
> > > > >
> > > > > 1. libgcc does not detect correctly that the application is multithreaded,
> > > > >    since `pthread_cancel` is not linked into the executable.
> > > > >    As a consequence, the lazy initialization of data structures for stack
> > > > >    unwinding (FDE tables) is executed without protection of a mutex.
> > > > >    Therefore, if the very first exception in the program happens to be
> > > > >    thrown in two threads concurrently, the data structures can be corrupted,
> > > > >    resulting in a busy loop after `main()` is finished.
> > > > > 2. If I make sure that I explicitly link in `pthread_cancel` this problem
> > > > >    is (almost certainly) gone, however, in certain scenarios this leads
> > > > >    to a crash when the first exception is thrown.
> > > > >
> > > > > I had first reported this problem to gcc as a bug against libgcc, but the
> > > > > gcc team denies responsibility, see
> > > > > [this bug report](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91737).
> > > >
> > > > This is a gcc bug and needs to be fixed in libgcc.
> > >
> > > I've updated the gcc tracker with more info, but I seem to lack the
> > > ability to reopen the bug myself.
> > >
> > > To add some more context, using weak references to determine if a
> > > library is linked is a dynamic-linking-centric hack and is not
> > > compatible with static linking. GCC has historically done this for
> > > glibc and other systems where libpthread was a separate library to
> > > avoid pulling in a dependency on it, but it's always been broken on
> > > glibc with static linking too. Various distros worked around this with
> > > horrible hacks as described in Andrew Pinski's reply to your bug
> > > report, using binutils tricks to move the whole libpthread.a into a
> > > single .o file so that if any of it gets linked it all gets linked.
> > > It's possible upstream glibc adopted this at some point; I'm not sure.
> > > But they're in the process of moving the mutex functions to libc
> > > instead of libpthread (and maybe even getting rid of libpthread like
> > > musl does), so GCC's hacks here won't even provide any benefit with
> > > future glibc versions.
> > >
> > > In any case, this kind of pushback against fixes for clear bugs used
> > > to be expected, but things have gotten a lot better with musl being
> > > more mainstream nowadays. I think the issue will get resolved quickly
> > > once a few more GCC developers look at it. It was actually just
> > > reopened while I was writing this email.
> > >
> > > Rich
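
A minimal sketch of the static-linking workaround described in the thread, written in C and assuming GCC or Clang extensions (`__typeof__`, `__attribute__((used))`, `__weakref__`). The names `demo_pthread_refs`, `demo_strong_refs`, `demo_weak_cancel`, `demo_threads_detected` and `demo_strong_refs_present` are hypothetical and are not part of libgcc or musl. The table holds strong references to the seven symbols listed above. The weakref probe only imitates the kind of check that libgcc_eh.a is described as performing; it is not copied from the libgcc sources.

```c
#include <pthread.h>

/* Hypothetical table type: one typed pointer per symbol named in the thread. */
struct demo_pthread_refs {
    __typeof__(pthread_cancel)       *cancel;
    __typeof__(pthread_getspecific)  *getspecific;
    __typeof__(pthread_key_create)   *key_create;
    __typeof__(pthread_mutex_lock)   *mutex_lock;
    __typeof__(pthread_mutex_unlock) *mutex_unlock;
    __typeof__(pthread_once)         *once;
    __typeof__(pthread_setspecific)  *setspecific;
};

/*
 * Strong references. The "used" attribute asks the compiler to emit the
 * table even if it could otherwise optimize it away, so the object file
 * carries undefined (strong) symbols that the static linker must resolve.
 */
static const struct demo_pthread_refs demo_strong_refs __attribute__((used)) = {
    pthread_cancel,
    pthread_getspecific,
    pthread_key_create,
    pthread_mutex_lock,
    pthread_mutex_unlock,
    pthread_once,
    pthread_setspecific,
};

/*
 * Weak alias for pthread_cancel. Its address is 0 at runtime when nothing
 * else in the link references pthread_cancel strongly. Here the table
 * above provides such a strong reference.
 */
static __typeof__(pthread_cancel) demo_weak_cancel
    __attribute__((__weakref__("pthread_cancel")));

/* Returns 1 if the weak probe sees pthread_cancel, 0 otherwise. */
int demo_threads_detected(void)
{
    return &demo_weak_cancel != 0;
}

/* Returns 1 if every entry in the strong-reference table is non-null. */
int demo_strong_refs_present(void)
{
    const struct demo_pthread_refs *r = &demo_strong_refs;
    return r->cancel && r->getspecific && r->key_create &&
           r->mutex_lock && r->mutex_unlock && r->once && r->setspecific;
}
```

Compiling a file like this into the statically linked program (for example with `gcc -static`) should pull the listed functions into the executable. Whether that also avoids the crash seen in the second example program is not established in this thread.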