mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: Bug report, concurrency issue on exception with gcc 8.3.0
Date: Tue, 17 Sep 2019 10:02:27 -0400
Message-ID: <20190917140227.GW9017@brightrain.aerifal.cx>
In-Reply-To: <20190917134422.aootviums4hdtell@zen.arangodb.com>

On Tue, Sep 17, 2019 at 03:44:22PM +0200, Max Neunhoeffer wrote:
> Hello,
> 
> I am experiencing problems when linking a large multithreaded C++ application
> statically against libmusl. I am using Alpine Linux 3.10.1 and gcc 8.3.0
> on X86_64. That is, I am using libmusl 1.1.22-r3 (Alpine Linux versioning)
> and gcc 8.3.0-r0.
> 
> Before going into details, here is an overview:
> 
> 1. libgcc does not correctly detect that the application is multithreaded,
>    since `pthread_cancel` is not linked into the executable.
>    As a consequence, the lazy initialization of data structures for stack
>    unwinding (FDE tables) is executed without protection of a mutex.
>    Therefore, if the very first exception in the program happens to be
>    thrown in two threads concurrently, the data structures can be corrupted,
>    resulting in a busy loop after `main()` is finished.
> 2. If I make sure that I explicitly link in `pthread_cancel`, this problem
>    is (almost certainly) gone; however, in certain scenarios this leads
>    to a crash when the first exception is thrown.
> 
> I first reported this problem to gcc as a bug against libgcc, but the
> gcc team denies responsibility; see
> [this bug report](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91737).

This is a gcc bug and needs to be fixed in libgcc.

Rich



> I have produced small sample programs to exhibit the problems, see below for
> a more detailed analysis as to what happens.
> 
> For case 1:
> 
> ------------------------ snip exceptioncollision.cpp ----------------------
> #include <thread>
> #include <atomic>
> #include <chrono>
> 
> std::atomic<int> letsgo{0};
> 
> void waiter() {
>   size_t count = 0;
>   while (letsgo == 0) {
>     ++count;
>   }
>   try {
>     throw 42;
>   } catch (int const& s) {
>   }
> }
> 
> int main(int, char*[]) {
> #ifdef REPAIR
>   try { throw 42; } catch (int const& i) {}
> #endif
>   std::thread t1(waiter);
>   std::thread t2(waiter);
>   std::this_thread::sleep_for(std::chrono::milliseconds(10));
>   letsgo = 1;
>   t1.join();
>   t2.join();
>   return 0;
> }
> ------------------------ snip exceptioncollision.cpp ----------------------
> 
> Use Alpine Linux 3.10.1, for example in a Docker container, and compile
> as follows:
> 
>     g++ exceptioncollision.cpp -o exceptioncollision -O0 -Wall -std=c++14 -lpthread -static
> 
> Then execute the static executable multiple times:
> 
>     while true ; do ./exceptioncollision ; date ; done
> 
> After a few tries it will freeze.
> 
> 
> For case 2:
> 
> ----------------------------------- snip exceptionbang.cpp ---------------
> #include <pthread.h>
> //#include <iostream>
> 
> #ifdef REPAIR
> void* g(void *p) {
>   return p;
> }
> 
> void f() {
>   pthread_t t;
>   pthread_create(&t, nullptr, g, nullptr);
>   pthread_cancel(t);
>   pthread_join(t, nullptr);
> }
> #endif
> 
> int main(int argc, char*[]) {
> #ifdef REPAIR
>   if (argc == -1) { f(); }
> #endif
>   //std::cout << "Hello world!" << std::endl;
>   try { throw 42; } catch(int const& i) {};
>   return 0;
> }
> ----------------------------------- snip exceptionbang.cpp ---------------
> 
> Use Alpine Linux 3.10.1, for example in a Docker container, and compile
> as follows:
> 
>     g++ exceptionbang.cpp -o exceptionbang -Wall -Wextra -O0 -g -std=c++14 -static -DREPAIR=1
> 
> Execute `./exceptionbang` and it will crash with a segmentation violation.
> 
> Curiously, if you uncomment the line
> 
>     //#include <iostream>
> 
> then more static initialization code seems to be compiled in and
> all is well.
> 
> More detailed analysis of what is happening:
> 
> Let's look at case 1 first:
> 
> libgcc insists that it is a good idea to check for the presence of
> `pthread_cancel` to detect if the application is multi-threaded. Therefore,
> in my case, since I do not explicitly use `pthread_cancel` and am
> linking statically, the libgcc runtime thinks that the program is
> single-threaded (since `pthread_cancel` is in its own compilation
> unit). As a consequence the mutex
> [here](https://github.com/gcc-mirror/gcc/blob/4ac50a4913ed81cc83a8baf865e49a2c62a5fe5d/libgcc/unwind-dw2-fde.c#L1045) is not actually used.
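
A minimal sketch of the kind of weak-symbol check described above, assuming a GCC-compatible compiler that supports the `weakref` attribute. The names `weak_pthread_cancel`, `threads_look_active`, `lock_if_threaded`, `unlock_if_threaded` and `object_mutex` are made up for illustration; this is only modeled on the behaviour described here and is not libgcc's actual code.

```cpp
#include <cassert>
#include <pthread.h>

// Weak reference to pthread_cancel (GCC/Clang extension). Referring to the
// symbol this way does not pull its object file into a static link, so the
// address is null unless something else already caused it to be linked in.
static __typeof(pthread_cancel) weak_pthread_cancel
    __attribute__((__weakref__("pthread_cancel")));

// Hypothetical helper: "threads are in use" is inferred purely from whether
// pthread_cancel happens to be present in the final executable.
bool threads_look_active() {
  return &weak_pthread_cancel != nullptr;
}

// Hypothetical mutex standing in for the one that guards the unwinder tables.
pthread_mutex_t object_mutex = PTHREAD_MUTEX_INITIALIZER;

// Takes the mutex only if the heuristic says threads are active.
// Returns true if the mutex was actually taken.
bool lock_if_threaded() {
  if (!threads_look_active()) return false;  // heuristic says single-threaded
  int rc = pthread_mutex_lock(&object_mutex);
  assert(rc == 0);
  (void)rc;
  return true;
}

// Releases the mutex if lock_if_threaded() took it.
void unlock_if_threaded(bool locked) {
  if (locked) pthread_mutex_unlock(&object_mutex);
}
```

In a static link where nothing references `pthread_cancel`, `threads_look_active()` returns false under this scheme, so `lock_if_threaded()` skips the mutex even if the program has started threads by other means.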
> 
> Therefore some code in `libgcc`, which is executed when an exception is
> first thrown in the life of the process ([see here](https://github.com/gcc-mirror/gcc/blob/4ac50a4913ed81cc83a8baf865e49a2c62a5fe5d/libgcc/unwind-dw2-fde.c#L1072)),
> is not thread-safe and corrupts the data structure `seen_objects`, rendering
> the singly linked list circular.
> 
> This in the end leads to a busy loop [here](https://github.com/gcc-mirror/gcc/blob/4ac50a4913ed81cc83a8baf865e49a2c62a5fe5d/libgcc/unwind-dw2-fde.c#L221).
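
To illustrate how an unprotected move between two lists can make a singly linked list circular, here is a simplified, single-threaded model that replays one possible interleaving by hand. Everything in it (`Object`, `insert_seen`, `has_cycle`, the two `simulate_*` functions) is hypothetical and only loosely modeled on the description above. It assumes that both threads read the same list head before either unlinks it, and it is not claimed to be the exact sequence that occurs in libgcc.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a registered unwind object.
struct Object {
  std::size_t pc_begin;
  Object* next;
};

// Sorted insert (descending pc_begin) into the "seen" list.
void insert_seen(Object*& seen, Object* ob) {
  Object** p = &seen;
  while (*p && (*p)->pc_begin >= ob->pc_begin) p = &(*p)->next;
  ob->next = *p;
  *p = ob;
}

// Floyd's cycle detection: true if following next pointers never terminates.
bool has_cycle(const Object* head) {
  const Object* slow = head;
  const Object* fast = head;
  while (fast && fast->next) {
    slow = slow->next;
    fast = fast->next->next;
    if (slow == fast) return true;
  }
  return false;
}

// Replays the racy interleaving: both "threads" pick up the same object.
bool simulate_double_insert() {
  Object ob{0x1000, nullptr};
  Object* unseen = &ob;
  Object* seen = nullptr;

  Object* a = unseen;  // thread A reads the head of the unseen list
  Object* b = unseen;  // thread B reads the same head before A unlinks it

  unseen = a->next;    // thread A unlinks and inserts
  insert_seen(seen, a);
  unseen = b->next;    // thread B does the same with the same object
  insert_seen(seen, b);

  assert(unseen == nullptr);
  return has_cycle(seen);  // the object now points at itself
}

// Same work with the critical section serialized, as a mutex would enforce.
bool simulate_serialized_insert() {
  Object ob{0x1000, nullptr};
  Object* unseen = &ob;
  Object* seen = nullptr;
  for (int thread = 0; thread < 2; ++thread) {
    Object* taken = unseen;  // read while holding the (imagined) lock
    if (!taken) continue;    // second thread finds nothing left to move
    unseen = taken->next;
    insert_seen(seen, taken);
  }
  return has_cycle(seen);
}
```

In this model the second insertion walks past the object itself and then links the object to itself, so any later traversal of the list never reaches a null pointer. With the critical section serialized, the second thread finds the unseen list empty and the list stays acyclic.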
> 
> 
> Now let's look at case 2:
> 
> I tried to "fix" this by using `pthread_cancel` explicitly. This is how
> I arrived at the second example program `exceptionbang.cpp`. Here, libgcc
> does detect that the program is multi-threaded. However, the program
> crashes when the first exception is thrown. I do not understand the
> details, but it seems that the libgcc runtime code stumbles over some
> data structures which are not properly initialized. When including the
> header `iostream`, some more code is compiled in which initializes the
> structures and all is well.
> 
> 
> Please let me know if you need any more information and please Cc me in
> communication about this issue.
> 
> Cheers,
>   Max.


