From: 黄建忠
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: musl pthread/tls issue.
Date: Fri, 24 Oct 2014 15:35:46 +0800
Message-ID: <544A0152.4040201@i-soft.com.cn>
References: <54474F9D.3090306@i-soft.com.cn> <20141022074536.GF16659@port70.net>
In-Reply-To: <20141022074536.GF16659@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

Great clue, thanks.

It is a stack overflow issue. The default pthread stack size is 81920
bytes, that is 80k. After I increased the stack size to 8M, the bug
disappeared.

I had tried adding locks and making local copies, and had even found
that it was an overflow issue, but I was stupid enough to forget about
the thread stack size (since the default is sufficient under glibc).

As for webkit, the different codebases of webkitgtk behave differently:

2.4.x runs but reports a RangeError exception.
2.6.x (they call it webkitgtk4) uses the same codebase as ewebkit and
segfaults directly.

I guess it is related to the "fastmalloc" of JavaScriptCore.

On 10/22/14 15:45, Szabolcs Nagy wrote:
> * ?????? [2014-10-22 14:33:01 +0800]:
>> These days, I finished build a bootable x86_64 system(rpm based) include
>> musl/systemd/dracut/gcc-4.9.1/gcc-5/clang-3.5 and wayland/Xorg and the
>> whole GNOME-3.14 desktop(except webkit js segfault issue I mentioned
>> before) with a lot of patches(I will release all of them someday until
>> it reach a stable state.)
>>
>> After a simple try, I found gnome-shell will segfault If I triggered the
>> app list(not always but often).
>>
>> The dmesg report "pool [] segfault xxxxxxxxxxx
>> libpixman-xxxxx", That's to say, it segfault in pixman library(A common
>> library used by Xorg and cairo),
>> gdb report it's a thread issue(a thread of gnome-shell) and segfault at
>> the beginning of general_composite_rect function in pixman-general.c,
>> the pointer of argument can not be accessed.
>>
> that's not enough info..
>
> both the webkit js and this crash sounds like thread stack overflow
>
>> That's to say, there must be a problem exist in musl pthread/tls
>> implementation and can be triggered under certain circumstances. Please
>> help to solve it.
>>
> i don't believe that without evidence: general_composite_rect itself
> allocates >24k on the stack, that is about a third of the musl default
> stack size
>
> you can verify it by checking the diff of the top and bottom of the stack
> (gdb backtrace prints the stack pointer, if the diff is >56k when that
> func was entered then this was the problem) or looking at /proc/pid/maps
> and if the crash happened in a guard page after a thread stack
>
> to fix: make the application create a larger thread stack eg 1M
> (pthread_attr_setstacksize, but gnome* will use gthread most likely
> which has different api)
>

-- 
Huang JianZhong